PEER Conference
   29th May, 2012

   Andrew Dorward
Project overview
• Large-scale EC-funded project (€4.2 million; 3 years, 9 months) to model the impact
  of Green OA on STM publishers
• 5 main consortium members (STM, ESF, Uni Göttingen, Max Planck Group,
  INRIA), plus input from SURF, Uni Bielefeld and an Advisory Board
• Involved 12 STM publishers (Elsevier, Wiley, Springer, IoP, OUP, Sage, BMJ,
  CUP, NPG, T&F et al.) and 6 repositories (a mix of IRs – TCD, Uni Göttingen; national
  repositories – INRIA, KBN; and subject repositories – MPG); no UK IRs involved
• 53,000 pre-prints (i.e. peer-reviewed Author Final Copy) from 241 journals
  (tertiles 1, 2, 3; >40% EU-sourced content) placed in repositories and on publisher
  websites, and downloads measured
• Uni Göttingen/INRIA-developed DRIVER tools – PEER Depot, PEER
  Observatory – used for large-scale publisher deposit, and infrastructure created
• Appears to demonstrate that IRs and publisher sites can happily co-exist for
  Green OA


• Health warnings:
    • Driven by STM, so may erect further barriers to Green OA (Peter Suber)
    • Only provides a snapshot over a very limited time period with many
      variables, so should not be taken as conclusive proof – needs longer study
      (CIBER, Paul Ayris, UCL)
The Publisher’s View (STM, MPG)




Implications for RepNet
• Like RepNet, PEER started out as Green-OA focussed, but overwhelming
  opinion from PEER study is that future direction will follow Gold route
• Consensus from PEER is that Green will exist as a hybrid to cover the switch in
  business processes to the Gold model
• The PEER Depot and Observatory infrastructure will support both Green and
  Gold OA – next stage would be to build a financial infrastructure to support the
  business model (Uni Bocconi)
• For the RepNet project to be relevant in 12 months, we must develop an
  infrastructure and processes to support both Green and Gold OA models


• What we need to do:
    • Create a new WP to model and support a Gold OA infrastructure
    • Build on excellent work done by Uni Bocconi (Milan) to model cost drivers
      for publishers and repositories to work on requirements of funders




The Repository View (MPG)
• 11,400 invitations from publishers to authors to deposit pre-prints in IRs resulted
  in only 170 deposits – why?
• OA IRs are no threat to publishers: results show IR deposit complements
  publisher downloads, BUT IRs are “not key to optimal scholarly information delivery
  systems” (MPG) – why?
• Repositories should “concentrate on research data archiving and curation which
  is essential” (STM) – why?
• The PEER Depot is essential as a ‘clearing house’ for complex, unstructured
  content and as a ‘dark archive’ – a feature of Green OA of pre-prints


• What does this mean for RepNet?
    • Self-archiving will not drive Green OA as we want
    • Clear STM agenda for publishers to control content delivery, not IRs –
      should we support this?
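
The uptake figure quoted in the first bullet of this slide can be checked with simple arithmetic. The following is a minimal sketch only: the two numbers come from the slide, while the variable names are illustrative and not part of the PEER study.

```python
# Figures quoted on the slide; variable names are illustrative assumptions.
invitations = 11_400  # publisher invitations sent to authors
deposits = 170        # resulting author deposits in IRs

deposit_rate = deposits / invitations
invitations_per_deposit = invitations / deposits

print(f"Deposit rate: {deposit_rate:.2%}")                         # Deposit rate: 1.49%
print(f"Invitations per deposit: {invitations_per_deposit:.0f}")   # Invitations per deposit: 67
```

Roughly one deposit for every 67 invitations, which is the basis for the conclusion that self-archiving alone will not drive Green OA.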




The Infrastructure View (INRIA)
• PEER Depot and PEER Observatory based on DRIVER tools produced by
  SURF and Uni Göttingen
• PEER and DRIVER: PEER populates DRIVER repositories and DRIVER
  facilitates access for the user community – Uni Göttingen co-ordinates
• Infrastructure now stable and proven, and can be used to support Gold OA with
  add-ons to support financial and business processes
• Immense technical/organisational problems around: formats (TeX, LaTeX, etc.);
  publisher processes; publisher metadata (NLM 2.0, 3.0, ScholarOne); metadata
  interchange standard (TEI); deposit (SWORD/SONEX); extracting metadata
  from PDF (GROBID) – caused the project to overrun by 9 months, leaving only a
  3-month ‘snapshot’ of download behaviour from publishers and repositories
• The PEER Depot functions – metadata consolidation and curation; embargo and
  withdrawal procedures; dark archive
• PEER Observatory functions – workflow, filtering


• What does this mean for RepNet?
    • RJ-Broker is a vital component
    • Keep open mind on Open Depot




PEER End of Project Report
