Finding out about the preservation of e-journals: an overview of the PEPRS project
Fred Guy, EDINA, University of Edinburgh
UKSG Conference 2011, 4th – 6th April 2011, Harrogate, Yorkshire
http://www.flickr.com/photos/sinclairlibrary/769777273/sizes/z/in/photostream/
Computer room in London School of Economics   1981 http://www.flickr.com/photos/lselibrary/4401344940/sizes/o/in/photostream/
Statistics related to e-journals
Source: RIN. E-only scholarly journals: overcoming the barriers. November 2010.
[Three charts not reproduced; one surviving chart label reads 23%.]
Print – key aspects
  • Once purchased, material is owned by the library and can be retained, transferred to a remote store or disposed of when the library determines this
  • The library can check if other libraries hold the material, and it can be consulted on the premises or be available via Inter-Library Loan
  • It is likely to be available in a national library via legal deposit legislation (which goes back to the 17th century in the UK)
E-journals: key aspects
  • Libraries are licensed for usage – they do not host the material
  • Control lies with the publisher rather than with the subscriber
  • Publishers are not a constant in the life of a journal – titles are often transferred between publishers
  • Publishers may decide that they do not want to host back material
  • Legislation for legal deposit is not yet in place in the UK and many other countries
Why a Preservation Registry?
  • Many schemes are emerging to meet the challenge
  • But who is doing what?
  • How can libraries & policy-makers assess which e-journals are being archived, by what methods, and under what terms of access?
  • JISC commissioned a scoping study for an e-journals preservation registry; the idea had been mentioned in the literature
Scoping Study for a Registry
Scoping Study Report Precedes PEPRS
  • Rightscom / Loughborough University, 2007
  • Confirmed expressed need among libraries and policy makers
  • Warned of potential burden on digital preservation agencies
  • Recommended:
      – an e-journals preservation registry should be built
      – UK Union Catalogue of Serials (SUNCAT) or SHERPA (Open Access) get involved
  • SUNCAT is hosted and managed at EDINA
PROJECT DETAILS
  • Phase 1 funded by JISC (Preservation Programme) from August 2008 – July 2010
  • EDINA, University of Edinburgh, grant recipient
  • Project partner – ISSN International Centre, Paris
  • Evaluation carried out by Charles Beagrie Limited for the JISC in February 2010
Digital Preservation Agencies in the Pilot
  • Two 3rd Party Organisations
      – CLOCKSS (Controlled Lots Of Copies Keep Stuff Safe)
      – Portico
  • Two National Libraries (cf. legal deposit)
      – British Library (BL): British Library e-Journal Digital Archive
      – Koninklijke Bibliotheek (KB e-Depot): KB, National Library of the Netherlands
  • One library cooperative
      – UK LOCKSS (Lots Of Copies Keep Stuff Safe) Alliance
The Agencies - LOCKSS LOCKSS (Lots of Copies Keep Stuff Safe),  based at Stanford University Libraries, is an international community initiative that provides libraries with digital preservation tools and support so that they can easily and inexpensively collect and preserve their own copies of authorized e-content.
The Agencies - CLOCKSS CLOCKSS ( Controlled  LOCKSS)  is a not for profit joint venture between the world’s leading scholarly publishers and research libraries whose mission is to build a sustainable, geographically distributed dark archive with which to ensure the long-term survival of Web-based scholarly publications for the benefit of the greater global research community.
The Agencies - Portico Portico  provides libraries and publishers with a reliable, cost-effective solution to one of the most critical challenges facing the scholarly community today—ensuring that the electronic resources you rely on everyday will be accessible to future researchers, scholars, and students.
The Agencies – e-Depot The  e-Depot  is a digital archiving environment that ensures long-term access to digital objects. e-Depot  is based at the Koninklijke Bibliotheek in The Hague
The Agencies – British Library The  BL  preserves digital content that is collected but also material that is created, such as digitised collections. The store is an important component for forthcoming e-Legal Deposit.
What is in the vaults?
Images: http://www.flickr.com/photos/wka/4283285201/ and http://www.flickr.com/photos/mcfull/421644442/sizes/s/in/photostream/
[Diagram: five "Agency metadata" sources feeding into PEPRS.]
Image: http://www.flickr.com/photos/akeeh/4300472592/sizes/z/in/photostream/
Creating the PEPRS database
[Diagram labels: Agency data; ISSN Register; ISSNs; PEPRS; ISSN-L + p-ISSN & e-ISSN; Register metadata; Agency metadata]
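The matching step implied by this slide can be sketched roughly as follows. This is only an illustration, assuming that each agency record carries an ISSN and that the ISSN Register supplies a lookup from any p-ISSN or e-ISSN to its ISSN-L. The ISSNs, agency names and the function `group_by_issn_l` are all made up for the example; PEPRS itself is described as using Perl, so this Python is not the project's code.

```python
from collections import defaultdict

# Hypothetical lookup derived from ISSN Register data: ISSN -> ISSN-L.
# These ISSNs are placeholders, not real titles.
ISSN_TO_ISSN_L = {
    "1111-1111": "1111-1111",  # p-ISSN (also serves as the ISSN-L here)
    "2222-2222": "1111-1111",  # e-ISSN of the same title
}

# Hypothetical agency records, each reporting holdings under some ISSN.
AGENCY_RECORDS = [
    {"agency": "Agency A", "issn": "1111-1111", "holdings": "v. 1 - 10"},
    {"agency": "Agency B", "issn": "2222-2222", "holdings": "v. 5 - 12"},
    {"agency": "Agency C", "issn": "9999-9999", "holdings": "v. 1"},
]


def group_by_issn_l(records, issn_to_issn_l):
    """Group agency records under the ISSN-L of the title they describe.

    Records whose ISSN is not in the lookup are returned separately so
    they can be reviewed by hand rather than silently dropped.
    """
    grouped = defaultdict(list)
    unmatched = []
    for record in records:
        issn_l = issn_to_issn_l.get(record["issn"])
        if issn_l is None:
            unmatched.append(record)
        else:
            grouped[issn_l].append(record)
    return dict(grouped), unmatched


grouped, unmatched = group_by_issn_l(AGENCY_RECORDS, ISSN_TO_ISSN_L)
for issn_l, records in grouped.items():
    print(issn_l, [r["agency"] for r in records])
print("unmatched:", [r["issn"] for r in unmatched])
```

Keeping unmatched records in their own list reflects the point that some agency ISSNs may not be found in the Register at all.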
Open Source components used in PEPRS
  • Component: User interface
    Software choice: Apache::ASP http://www.apache-asp.org/
    Comment: Offers fast and easy development and is extremely flexible.
  • Component: Database (metadata hosted by PEPRS)
    Software choice: Zebra http://www.indexdata.dk/zebra/
    Comment: Provides structured text indexing and retrieval. Fast and scales well. Provides powerful and flexible text retrieval capabilities.
  • Component: Harvester
    Software choice: Custom Perl and CPAN packages
    Comment: Data files will be collected using FTP and HTTP.
  • Component: Normalisation
    Software choice: Custom Perl and CPAN packages including MARC::Record http://search.cpan.org/~gmcharlt/MARC-Record-2.0.2/
    Comment: Each preservation agency supplies custom data at the moment, so scripts will be created for each data source. ISSN data is in MARC21 format and will be processed using the MARC::Record CPAN package.
  • Component: Z39.50 support in Perl
    Software choice: ZOOM http://zoom.z3950.org/api/
    Comment: Abstract Perl API supporting search and retrieval. Based on the YAZ toolkit.
Beta service demonstration
PEPRS Phase 2
  • Funding provided from August 2010 – July 2012
  • Beta service – end of April 2011, www.peprs.org/
  • Full service – 2012
  • Involve international users in testing
Forthcoming functionality in 2011
  • Browsing
  • Advanced searching features
  • Machine to machine (for comparison work) and for OpenURL operations
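As a rough idea of what an OpenURL-style machine-to-machine request for a journal could look like, the sketch below builds an OpenURL 1.0 (Z39.88-2004) query string in key/encoded-value form. The base URL `https://resolver.example.org/openurl` and the function `build_journal_openurl` are hypothetical; the slides do not specify the PEPRS endpoint or which keys it will accept.

```python
from urllib.parse import urlencode


def build_journal_openurl(base_url, issn, volume=None):
    """Build an OpenURL 1.0 KEV query for a journal identified by ISSN.

    base_url is a placeholder resolver address supplied by the caller.
    """
    params = [
        ("url_ver", "Z39.88-2004"),                      # OpenURL 1.0 version tag
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),  # journal metadata format
        ("rft.issn", issn),
    ]
    if volume is not None:
        params.append(("rft.volume", str(volume)))
    return base_url + "?" + urlencode(params)


# Hypothetical resolver address; 0378-5955 is used only as a well-formed ISSN.
print(build_journal_openurl("https://resolver.example.org/openurl", "0378-5955", volume=42))
```

A link resolver or library system could issue such a request for each title it holds and compare the response with its own records.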
PEPRS Phase 2: key stages
[Timeline chart not reproduced. Time axis: Aug-10, Dec-10, Apr-11, Aug-11, Dec-11, Apr-12, Aug-12. Activities listed:]
  • Set up team of testers
  • User testing and feedback
  • Beta service - preparation
  • Beta service - operation
  • Full service
  • Advisory group on governance?
  • Governance in operation
  • Beta service – additional functionality
ISSN issues
  • ISSNs are missing in some agency records, and some are not in the ISSN Register
  • Some duplicate records
  • Some p-ISSNs used as e-ISSNs
  • Some p-ISSNs are linked via a common ISSN-L to a number of e-ISSNs, but which one is correct?
  • Some were incorrect
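One basic check that can be applied to incoming agency ISSNs is the standard ISSN check digit (weights 8 down to 2 over the first seven digits, modulo 11, with "X" standing for 10). The sketch below is an illustration only, with hypothetical function names. It catches malformed or mistyped ISSNs but cannot tell whether a well-formed p-ISSN has been used where an e-ISSN was meant; that needs the ISSN Register.

```python
import re

# Accepts "NNNN-NNNC" with or without the hyphen; C may be a digit or X.
ISSN_PATTERN = re.compile(r"^(\d{4})-?(\d{3})([\dXx])$")


def issn_check_digit(first_seven):
    """Return the check character for the first seven digits of an ISSN."""
    # Weights run 8, 7, 6, 5, 4, 3, 2 across the seven digits.
    total = sum(int(d) * w for d, w in zip(first_seven, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_issn(candidate):
    """True if the string is a well-formed ISSN with a correct check digit."""
    match = ISSN_PATTERN.match(candidate.strip())
    if not match:
        return False
    body = match.group(1) + match.group(2)
    return issn_check_digit(body) == match.group(3).upper()


for issn in ["0378-5955", "0378-5954", "1050-124X", "not-an-issn"]:
    print(issn, is_valid_issn(issn))
```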
Holdings information - variation
  • e-Depot: Preserved: v. 1 - 36, 38 - 46.
  • UK LOCKSS Alliance: Preserved: v. 42 - 45. In progress: v. 46, 47.
  • Portico: Preserved: (2002-2009) v.40, v.41, v.42, v.43, v.44, v.45, v.46, v.47.
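To compare statements like these, each one has to be reduced to a common form first. The sketch below is one possible approach, assuming the only things that matter are volume numbers and volume ranges; the function `parse_volumes` is hypothetical, and it deliberately ignores the parenthesised year range in the Portico statement.

```python
import re
from typing import Set


def parse_volumes(statement: str) -> Set[int]:
    """Expand a holdings statement into the set of volume numbers it covers."""
    text = re.sub(r"\([^)]*\)", " ", statement)  # drop "(2002-2009)" style year ranges
    text = re.sub(r"\bv\.\s*", " ", text)        # drop the "v." volume prefix
    volumes = set()
    # Match either a range "N - M" or a single number "N".
    for m in re.finditer(r"(\d+)\s*-\s*(\d+)|(\d+)", text):
        if m.group(3):
            volumes.add(int(m.group(3)))
        else:
            volumes.update(range(int(m.group(1)), int(m.group(2)) + 1))
    return volumes


# "Preserved" statements as they appear on the slide.
preserved = {
    "e-Depot": "v. 1 - 36, 38 - 46.",
    "UK LOCKSS Alliance": "v. 42 - 45.",
    "Portico": "(2002-2009) v.40, v.41, v.42, v.43, v.44, v.45, v.46, v.47 .",
}

parsed = {agency: parse_volumes(text) for agency, text in preserved.items()}
common = set.intersection(*parsed.values())
print("Preserved by all three:", sorted(common))
```

Once every statement is a set of integers, questions such as "which volumes are held by at least two agencies?" become ordinary set operations.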
Terms used by preservation agencies
Involvement with international initiatives
  • Print Archives Program of the Center for Research Libraries – "CRL is working with consortial partners to plan a prototype print archives framework to link existing print archiving efforts. [CRL] has developed a searchable Print Archives Registry of information about print-archiving initiatives, including: Projects, Serial Holdings."
  • HathiTrust – "... is committed to preserving the intellectual content and in many cases the exact appearance and layout of materials digitized for deposit. HathiTrust stores and preserves metadata detailing the sequence of files for the digital object."
PEPRS: Further information and Contact details
  • Project website: http://edina.ac.uk/projects/peprs/index.html
  • Beta service – to be available by 29th April: http://www.peprs.org/
  • Fred Guy, EDINA, University of Edinburgh [email_address]

Piloting an E-journals Preservation Registry Service: overview of PEPRS


Editor's Notes

  • #2 This talk is about a metadata project that plays a key role in informing librarians and collection managers about the long-term preservation situation of e-journals.
  • #3 This is the old scenario where users were faced with row upon row of printed journals.
  • #4 Now users are inclined to access this information via computers although not quite like these ones!
  • #5 All the statistics point to increased e-journal publication, expenditure and usage. RIN has done a lot of work to quantify the situation. Chart one shows availability by discipline, and it is no great surprise that the sciences have a very high percentage. The second chart shows that the bigger publishers have moved online to a greater extent than the smaller publishers, but online publication is increasing whatever the publisher's size. The third chart shows the increase in usage. There was a 23% increase in downloading between 2005/6 – 2006/07 and a 19% increase 2006-7. The increase is greater for the Scottish Higher Education Digital Libraries (SHEDL): 19.58% 2007-8 and 41.2% 2009-9.
  • #6 Print aspects. Essentially under library control.
  • #7 Essential aspect is that it is not under the control of the library as is the case with print.
  • #8 Schemes have emerged to provide solutions for libraries but there is a key issue in trying to obtain a coherent overall view.
  • #10 There is a lot of background literature, but the key report is the one prepared by Rightscom and Loughborough University. Essentially PEPRS has evolved from the findings in the report.
  • #18 How do we find out what is in the vaults and, more critically, how can we avoid having to look into each vault separately?
  • #19 This is a snow drop, showing that PEPRS is essentially an aggregation of metadata from a number of suppliers or, as they are called in the PEPRS context, archiving agencies.
  • #20 The key components. Essentially PEPRS is based upon metadata from the different participating agencies associated with authoritative metadata from the ISSN Register.
  • #21 These are the open source components used in PEPRS.
  • #22 Demonstration of the beta service.
  • #23 The initial search screen.
  • #24 Google like search box.
  • #25 Results screen.
  • #26 Results screen.
  • #27 Bibliographical information and preservation information.
  • #29 Individual title showing bibliographical information together with information about the preservation status.
  • #30 "Ready for archiving" – a LOCKSS term.
  • #31 Help screen.
  • #32 Information about the Archiving agencies.
  • #33 FAQ
  • #34 HELP information.
  • #35 Some information about Phase 2.
  • #36 Functionality planned for the beta service.
  • #37 Key stages.