OpenAIRE guidelines and broker service for repository managers - OpenAIRE #OAW2016 webinar


Presentation by Pedro Príncipe (University of Minho) and Paolo Manghi (CNR/ISTI) at the OpenAIRE Open Access Week webinar on the OpenAIRE compatibility guidelines and the dashboard for repository managers, held on Friday, October 28, 2016.

Published in: Science
  1. 1. OPENAIRE GUIDELINES & BROKER SERVICES FOR REPOSITORY MANAGERS “Webinar on the OpenAIRE compatibility guidelines and the dashboard for Repository Managers” Friday October 28, 2016 at 12.00 CEST. Paolo Manghi (ISTI-CNR), Pedro Príncipe (University of Minho)
  2. 2. OPENAIRE GUIDELINES & BROKER SERVICES FOR REPOSITORY MANAGERS Pedro Principe & Paolo Manghi Webinar – Oct. 28 2016
  3. 3. Agenda: 1. OpenAIRE infrastructure & content acquisition policy; 2. OpenAIRE guidelines and tools for compatibility; 3. Notification Broker service & Repository manager dashboard; 4. OpenAIRE and repositories moving towards Open Science publishing. #OAW2015
  4. 4. An Open Knowledge & Research Information Infrastructure • Foster and facilitate the shift of scholarly communication towards making science Open and Reproducible • Collaborative and participatory approach at European and global level. Stakeholders: research communities, research admins, researchers, funders, SMEs, content providers in scholarly communication. Networking & e-Infrastructure
  5. 5. OpenAIRE’s e-infrastructure Commons. CONTENT PROVIDERS: publications repositories, research data repositories, CRIS systems, registries (e.g. projects), OA journals, software repositories. INFO SPACE SERVICES: validation, cleaning, de-duplication, enrichment by inference. Information space entities: project, initiative, funder, funding, result, publication, data, software, organization. KEY STAKEHOLDERS SERVICES: Funders, research admins, research communities • Research impact • Project reporting and monitoring • Open Access trends. Content providers • Repository guidelines and validation • Repository notification broker • Repository analytics and usage stats. Researchers • Claim publications, datasets, software • Deposit publications, datasets, software • Search & browse: interlinked publications, datasets, projects • Open Access & DMP Helpdesk • End-user feedback. GUIDELINES, TERMS OF USE
  6. 6. OPEN ACCESS OpenAIRE implements the EC requirements & SUPPORTS THE OPEN DATA PILOT
  7. 7. Align OA policies. Sync infrastructures. OpenAIRE provides services for other funders (EU national funders)
  8. 8. OpenAIRE Content Acquisition: AUTHORITATIVE INFORMATION, RESEARCH DATA, PUBLICATIONS • Registries of Data Providers • OpenDOAR, • re3data, • DOAJ journal list, … • Funding Information • Author-/Contributor info OpenAIRE will expand the current policy to other dataset classes (Open Access datasets) because OpenAIRE would like to have some for quality certification.
  9. 9. INTEROPERABILITY: GUIDELINES & VALIDATOR. Data providers: common standards/best practices for data providers (Guidelines for literature, data repositories, aggregators, OA journals, CRIS systems). Validator: web service or standalone
  10. 10. OpenAIRE Guidelines: INTEROPERABILITY IS KEY • Common standards/best practices for data providers (Guidelines for literature, data repositories, aggregators, OA journals, CRIS systems). Validator: web service or standalone. • OpenAIRE has collaborated with key stakeholders and has produced three sets of guidelines for its data providers, all based on existing well-established standards. • Best practices for the use of transfer protocol (OAI-PMH), metadata formats, controlled vocabularies.
  11. 11. Guidelines for Data Providers: 1. Literature repositories (and journal platforms): Dublin Core (DRIVER); 2. Data repositories (and archives/data centres): DataCite; 3. CRIS systems: CERIF-XML
  13. 13. How do they work? • Identification of Open Access and funded research results by OAI-Sets: • ‘openaire’ for publications • ‘openaire_data’ for research datasets • Latest schema guarantees backward-compatibility with previous versions. • Complemented by metadata enrichment thanks to OpenAIRE’s text-mining services.
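As an illustration of how the two OAI sets named on this slide might be used by a harvester, here is a minimal Python sketch that builds an OAI-PMH `ListRecords` request restricted to one of them. The base URL, the `list_records_url` helper and the default `oai_dc` metadata prefix are assumptions made for the example; only the set names `openaire` and `openaire_data` come from the slide.

```python
from urllib.parse import urlencode

# Set names taken from the slide; everything else is illustrative.
OPENAIRE_SETS = {"publications": "openaire", "datasets": "openaire_data"}


def list_records_url(base_url, content, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request limited to an OpenAIRE set.

    base_url is a hypothetical repository endpoint; metadata_prefix is an
    assumption and may differ per repository (e.g. for data repositories).
    """
    params = {
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": OPENAIRE_SETS[content],
    }
    return f"{base_url}?{urlencode(params)}"


print(list_records_url("https://repository.example.org/oai", "publications"))
```

The same helper with `"datasets"` would request the `openaire_data` set instead.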
  14. 14. Literature Guidelines: OpenAIRE OAI-Set • To group metadata relevant for OpenAIRE • See policy/content-acquisition-policy • Metadata about Open Access Publications • Metadata about Publications funded in EC-FP7 / H2020 • Metadata about Publications funded by other funders • OpenAIRE provides information about supported funding information. The OpenAIRE set: setName = OpenAIRE, setSpec* = openaire
  15. 15. projectID. Element name: projectID. DCMI definition: dc:relation. Usage: Mandatory (if applicable). Usage instruction: A vocabulary of projects is exposed by the OpenAIRE API and is available for all repository managers. Values include funder, project name and projectID. The projectID equals the Grant Agreement number, and is defined by the namespace: info:eu-repo/grantAgreement/Funder/FundingProgram/ProjectNumber/Jurisdiction/ProjectName/ProjectAcronym/ Examples: <dc:relation>info:eu-repo/grantAgreement/EC/FP7/123456</dc:relation> <dc:relation>info:eu-repo/grantAgreement/EC/FP7/12345/EU//Acronym</dc:relation>
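The namespace pattern above can be assembled mechanically. The following Python sketch is one possible way to do it, assuming (as the two slide examples suggest) that the trailing Jurisdiction/ProjectName/ProjectAcronym segments are optional and that an empty segment is kept as an empty path step. The function name `grant_agreement` is hypothetical, and no escaping of special characters is attempted.

```python
def grant_agreement(funder, programme, project_number,
                    jurisdiction="", name="", acronym=""):
    """Compose an info:eu-repo/grantAgreement value for dc:relation.

    The three trailing segments are treated as optional (an assumption
    based on the slide's examples); if any is given, all three positions
    are emitted so that empty ones show up as '//'.
    """
    base = f"info:eu-repo/grantAgreement/{funder}/{programme}/{project_number}"
    optional = [jurisdiction, name, acronym]
    if not any(optional):
        return base
    return base + "/" + "/".join(optional)


print(grant_agreement("EC", "FP7", "123456"))
print(grant_agreement("EC", "FP7", "12345", jurisdiction="EU", acronym="Acronym"))
```

Both calls reproduce the two example values shown on the slide.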
  16. 16. accessRights. Element name: accessRights. DCMI definition: dc:rights. Usage: Mandatory. Usage instruction: Use values from the Access Rights vocabulary: • info:eu-repo/semantics/closedAccess • info:eu-repo/semantics/embargoedAccess • info:eu-repo/semantics/restrictedAccess • info:eu-repo/semantics/openAccess Example: <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
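A repository could check its own records against this vocabulary before exposing them. The sketch below is a minimal example of such a check in Python; the four values come from the slide, while the helper names and the rule that exactly one access-rights value should be present are assumptions made for illustration (other dc:rights values, such as a license URL, are simply ignored).

```python
# Controlled vocabulary as listed on the slide.
ACCESS_RIGHTS = frozenset({
    "info:eu-repo/semantics/closedAccess",
    "info:eu-repo/semantics/embargoedAccess",
    "info:eu-repo/semantics/restrictedAccess",
    "info:eu-repo/semantics/openAccess",
})


def access_rights_values(dc_rights):
    """Return the dc:rights values that belong to the vocabulary."""
    return [v.strip() for v in dc_rights if v.strip() in ACCESS_RIGHTS]


def has_valid_access_rights(dc_rights):
    """Assumed rule: a record should carry exactly one access-rights term."""
    return len(access_rights_values(dc_rights)) == 1


print(has_valid_access_rights([" info:eu-repo/semantics/openAccess ",
                               "https://example.org/license"]))
```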
  17. 17. embargoEndDate. Element name: embargoEndDate. DCMI definition: dc:date. Usage: Mandatory (if applicable). Usage instruction: Recommended when accessRights = info:eu-repo/semantics/embargoedAccess. The date type is controlled by the namespace info:eu-repo/date/embargoEnd/, see eu-repo/#info-eu-repo-DateTypesandvalue. Encoding of this date should be in the form YYYY-MM-DD (conforming to ISO 8601). Example: <dc:date>info:eu-repo/date/embargoEnd/2011-05-12</dc:date>
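To make the encoding rule concrete, here is a small Python sketch that writes and reads an embargo end date in the form described above. The helper names are hypothetical; the prefix and the YYYY-MM-DD format are taken from the slide, and the standard library's ISO 8601 date handling is used for the date part.

```python
from datetime import date

EMBARGO_PREFIX = "info:eu-repo/date/embargoEnd/"


def embargo_end_value(end):
    """Encode a date as a dc:date embargo value (YYYY-MM-DD, ISO 8601)."""
    return EMBARGO_PREFIX + end.isoformat()


def parse_embargo_end(value):
    """Return the embargo end date, or None if value is another kind of dc:date.

    Raises ValueError when the prefix is present but the date is malformed.
    """
    value = value.strip()
    if not value.startswith(EMBARGO_PREFIX):
        return None
    return date.fromisoformat(value[len(EMBARGO_PREFIX):])


print(embargo_end_value(date(2011, 5, 12)))
print(parse_embargo_end(" info:eu-repo/date/embargoEnd/2011-05-12 "))
```

A plain publication date such as `2015` in another dc:date element is left alone: `parse_embargo_end` returns `None` for it.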
  18. 18. Alternative Identifier. Element name: Alternative Identifier. DCMI definition: dc:relation. Usage: Recommended. Usage instruction: List alternative identifiers for this publication that are not the primary identifier (repository splash page), e.g., the DOI of publisher’s version, the PubMed/arXiv ID. The term is defined by info:eu-repo/semantics/altIdentifier, with the syntax info:eu-repo/semantics/altIdentifier/<scheme>/<identifier>, where <scheme> must be one of the following: ark, arxiv, doi, hdl, isbn, purl… Example: <dc:relation>info:eu-repo/semantics/altIdentifier/doi/10.1234/789.1</dc:relation>
  19. 19. Referenced Dataset. Element name: Referenced Dataset. DCMI definition: dc:relation. Usage: Recommended. Usage instruction: Encodes links to research datasets connected with this publication. The syntax of info:eu-repo/semantics/dataset is: info:eu-repo/semantics/dataset/<scheme>/<identifier>, where <scheme> must be one of the following: ark, arxiv, doi, hdl, isbn, purl… Example: <dc:relation>info:eu-repo/semantics/dataset/doi/10.1234/789.1</dc:relation>
  20. 20. Referenced Publication. Element name: Referenced Publication. DCMI definition: dc:relation. Usage: Recommended. Usage instruction: Encodes links to publications referenced by this publication. The syntax of info:eu-repo/semantics/reference is: info:eu-repo/semantics/reference/<scheme>/<identifier>, where <scheme> must be one of the following: ark, arxiv, doi, hdl, isbn… Example: <dc:relation>info:eu-repo/semantics/reference/doi/10.1234/789.1</dc:relation>
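The three dc:relation forms on the preceding slides (altIdentifier, dataset, reference) share one shape, so a harvester could split them with a single pattern. The Python sketch below shows one way to do that; the regular expression and the `parse_relation` helper are assumptions for illustration. Because the slides end their scheme lists with an ellipsis, the scheme is not validated against a fixed list here.

```python
import re

# kind / scheme / identifier; the identifier may itself contain slashes (DOIs).
RELATION = re.compile(
    r"^info:eu-repo/semantics/(altIdentifier|dataset|reference)/([^/]+)/(.+)$"
)


def parse_relation(value):
    """Split a dc:relation value into (kind, scheme, identifier).

    Returns None for relations of another type, such as grantAgreement.
    """
    match = RELATION.match(value.strip())
    return match.groups() if match else None


print(parse_relation("info:eu-repo/semantics/altIdentifier/doi/10.1234/789.1"))
print(parse_relation("info:eu-repo/grantAgreement/EC/FP7/123456"))
```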
  21. 21. OpenAIRE Compatibility Status: Levels and OAI Sets • OpenAIRE basic: only Open Access content, via the driver OAI set • OpenAIRE 2.0: EC funded content, via the ec_fundedresources OAI set • OpenAIRE 2.0+: Open Access and EC funded content, via the driver and ec_fundedresources OAI sets • OpenAIRE 3.0: Open Access and/or EC funded and/or national/other funded content, via the openaire OAI set
  22. 22. Meet H2020 OA Guidelines (Property | DC field | Value) • EU funding acknowledgment | dc:contributor | “controlled” terms: ["European Union (EU)" and "Horizon 2020"] ["Euratom" and "Euratom research and training programme 2014-2018"] • Peer reviewed | dc:type | info:eu-repo/semantics/publishedVersion • Embargo period | dc:date, dc:rights | info:eu-repo/date/embargoEnd/<YYYY-MM-DD>; <YYYY-MM-DD> (as publication date); info:eu-repo/semantics/embargoedAccess • Project information | dc:relation | info:eu-repo/grantAgreement/EC/H2020/[ProjectID]/[Jurisdiction]/[ProjectName]/[ProjectAcronym]/ • Persistent identifier | dc:identifier or dc:relation • License | dc:rights | URL of license condition • Persistent IDs for authors and contributors | dc:creator, dc:contributor | <Lastname, Firstname; id_orcid 0000-0000-0000-0000> • Reference to related research outcome | dc:relation | info:eu-repo/semantics/dataset/<scheme>/<id>
  23. 23. Sample DC-Record: <dc:language>eng</dc:language> <dc:creator>Stanojević, Miloš</dc:creator> <dc:creator>Sima’an, Khalil</dc:creator> <dc:title>Evaluating MT systems with BEER</dc:title> <dc:subject>info:eu-repo/classification/ddc/400</dc:subject> <dc:source>The Prague bulletin of mathematical linguistics 104(1), 17-26(2015). doi:10.1515/pralin-2015-0010</dc:source> <dc:type>info:eu-repo/semantics/article</dc:type> <dc:type>info:eu-repo/semantics/publishedVersion</dc:type> <dc:publisher>Univ.</dc:publisher> <dc:date>2015</dc:date> <dc:rights>info:eu-repo/semantics/openAccess</dc:rights> <dc:coverage>DE</dc:coverage> <dc:identifier></dc:identifier> <dc:identifier> 07014%22</dc:identifier> <dc:identifier>http://publications.rwth- atical%20Linguistics%5D%20Evaluating%20MT%20systems%20with%20BEER.pdf</dc:identifier> <dc:audience>Researchers</dc:audience> <dc:relation>info:eu-repo/semantics/altIdentifier/doi/10.1515/pralin-2015-0010</dc:relation> <dc:relation>info:eu-repo/semantics/altIdentifier/issn/1804-0462</dc:relation> <dc:relation>info:eu-repo/semantics/altIdentifier/urn/urn:nbn:de:hbz:82-rwth-2016-070142</dc:relation> <dc:relation>info:eu-repo/semantics/altIdentifier/issn/0032-6585</dc:relation> <dc:relation>info:eu-repo/grantAgreement/EC/H2020/645452</dc:relation>
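To show how the OpenAIRE-specific values in such a record can be picked out, here is a minimal Python sketch using the standard library's ElementTree. It works on a trimmed copy of the sample record; the `<record>` wrapper element and the helper names are placeholders invented for the example (a real OAI-PMH response wraps the fields differently), and only the Dublin Core namespace URI is relied on.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Trimmed copy of the sample record; <record> is a placeholder wrapper.
SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Evaluating MT systems with BEER</dc:title>
  <dc:type>info:eu-repo/semantics/article</dc:type>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:relation>info:eu-repo/semantics/altIdentifier/doi/10.1515/pralin-2015-0010</dc:relation>
  <dc:relation>info:eu-repo/grantAgreement/EC/H2020/645452</dc:relation>
</record>"""


def dc_values(root, name):
    """Return the stripped text of every dc:<name> element under root."""
    return [(el.text or "").strip() for el in root.iter(f"{{{DC_NS}}}{name}")]


def summarize(xml_text):
    """Collect title, access rights, project links and alternative identifiers."""
    root = ET.fromstring(xml_text)
    relations = dc_values(root, "relation")
    return {
        "titles": dc_values(root, "title"),
        "rights": dc_values(root, "rights"),
        "projects": [r for r in relations
                     if r.startswith("info:eu-repo/grantAgreement/")],
        "alt_identifiers": [r for r in relations
                            if r.startswith("info:eu-repo/semantics/altIdentifier/")],
    }


summary = summarize(SAMPLE)
print(summary["projects"])
```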
  24. 24. OpenAIRE guidelines CONTINUE TO BE DEVELOPED to establish an open and sustainable scholarly communication infrastructure
  25. 25. OpenAIRE guidelines: future directions. OpenAIRE Guidelines for Data Archive Managers: * to comply with latest DataCite Metadata Schema 4.0 * new property FundingReference and subproperties funderName, funderIdentifier, awardNumber, awardURI. OpenAIRE Guidelines for CRIS Managers: * update of CERIF-XML (to make entities less normalized, e.g. to find all relevant metadata properties in the cf_publication entity; improved semantic vocabulary), in collaboration and alignment with EuroCRIS. OpenAIRE Guidelines for Literature Repository Managers: * continues its principle to follow and adopt established standards * but will have changes in the form of a new application profile * replacement of info:eu-repo by other controlled vocabularies and identifier systems, e.g. * COAR Resource Type Vocabulary * recommends the use of ORCID for author identifiers * recommends the use of FundRef and ISNI identifiers for funding organizations * extends simple Dublin Core by attributes * includes specific metadata properties from latest DataCite metadata kernel * FundingReference * Creator and Contributor incl. subproperties creatorName, contributorName, nameIdentifier * file element(s) to locate the fulltext file(s) incl. mimeType and accessRights * optionally includes properties from the bibo ontology to express details of serials (volume, issue, startPage, endPage) * OpenAIRE will start a consultation phase with LA Referencia, JISC and SHARE
  26. 26. Make Your Repository Available to OpenAIRE!
  27. 27. Becoming an OpenAIRE data provider: 1. Register your repository in OpenDOAR / re3data * institutional/thematic repository -> OpenDOAR * data repository -> re3data 2. Test compliance with the OpenAIRE Guidelines: make your repository OpenAIRE compliant with the help of the OpenAIRE validator service 3. Add your repository to OpenAIRE: register your repository in OpenAIRE; pre-filled information is imported from OpenDOAR or re3data
  28. 28. 1. Registration in Repository directories • For literature repositories use OpenDOAR • For research data repositories use re3data • If you are already registered in OpenDOAR: • Check if the information is up to date • Take care with the admin email contact and OAI configuration: baseURL, OAI-Set, Guidelines Compatibility
  30. 30. 2. Test the OpenAIRE Compliance: choose from the menu; finally, check the results
  31. 31. 3. Registration Form 31
  33. 33. OpenAIRE compatibility: Add-ons, patches or plugins for Repositories & Journals software: DSpace add-ons and versions compliance OpenAIRE Plug-in (OpenAIRE 2.0) EPrints - OpenAIRE compliance example (3.0) OAI_DC_OpenAIRE implementation for Zenodo OJS Plugin: OpenAIRE + OJS DRIVER-Plug-In
  34. 34. Need to integrate project and funding information into your institutional repository based on DSpace or ePrints? • Go for the DSpace/ePrints endpoints. Do you prefer a TSV with the list of projects by funding? • TSV endpoint is meant for
  35. 35. DSpace Add-ons for project IDs • OpenAIRE Authority Control • DSpace 3.2 (updated March 2014) • DSpace 1.8.2 • apoio/remository?func=fileinfo&id=354 • OpenAIRE funders projects list addon (NEW) • In use on the RCAAP Project (PT repositories) • Using the projects list provided by the OpenAIRE API. Allows users to search and include EC (+ WT + FCT) project IDs in the metadata of the records, formatted in accordance with OpenAIRE
  36. 36. Submission Workflow: search by the project name or project ID number; select the project and accept; the necessary namespace will be filled in
  37. 37. REDUCE WORKLOAD OF AUTHORS Repository managers to fulfill the EC Open Access requirements or other funders’ OA mandates
  38. 38. RESEARCHER DECIDES WHERE TO PUBLISH Check publishers policies on Open Access Journals Check for Article Processing Charges Subscription-based journal Self-archive in a repository Find at: IMMEDIATE OPEN ACCESS IMMEDIATE OR DELAYED OPEN ACCESS
  41. 41. Funded projects info in OpenAIRE • Collect metadata including project grantID from OpenAIRE compliant repositories • Metadata publications record enrichments by OpenAIRE deduplication • Link publications to projects by inference (text mining procedures) • Link publications to projects using the end-user service: claim publications
  43. 43. NOTIFICATION BROKER. Repositories: (meta)data and links exchange among different data providers.
  44. 44. Scenario • OpenAIRE aggregates metadata about publications from hundreds of repositories, aggregators, OA journals, and publishers • OpenAIRE guidelines: DC fields + access rights + funding projects + links to datasets or publications • Infers information about publications • Relationships to projects and datasets, citations, similarities • Finds duplicates of metadata records for the same publications and merges them to build a (possibly richer) representative record
  45. 45. Idea • Institutional repositories are interested in acquiring metadata that improves their collection of metadata records • Enrichment: enrich the records they already have with extra metadata information • Addition: add to their collection records that are “related with” the repository, i.e. they should/could be part of their collection
  46. 46. OpenAIRE Literature Broker sketch. OpenAIRE Data Sources feed the OpenAIRE Information Space Graph (deduplication, inference, aggregation). The OpenAIRE Notification Broker identifies “events” relevant to repositories (enrichments & additions) and stores them as Potential Notifications; repository admins subscribe (Subscriptions) and are notified by sending events (Delivered Notifications). Event (potential notification): • Message • Topic • TargetRepository • Trust
  47. 47. The Challenge • Enrichment is straightforward • Harvesting from repository and return to repository its records if they have been “enriched” by deduplication and/or inference • Addition is less obvious • Based on relationships, in turn identified by inference algorithms • Must be augmented with notion of “trust” to enable “tuning” options in order to reduce false positive notifications
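The event structure from the broker sketch (message, topic, target repository, trust) and the idea of tuning by trust can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the slides do not say how trust is represented, so a float between 0 and 1 is assumed here, and the `Event` class, `filter_by_trust` function and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A potential notification, with the four fields named on the slide."""
    message: str
    topic: str
    target_repository: str
    trust: float  # assumed scale: 0.0 (weak evidence) to 1.0 (certain)


def filter_by_trust(events, min_trust):
    """Keep only events whose trust reaches the subscriber's threshold."""
    return [e for e in events if e.trust >= min_trust]


events = [
    Event("DOI found by deduplication", "ENRICHMENT.NEW.dc:identifier-if-DOI", "repoA", 0.95),
    Event("Possibly funded by a partner project", "ADDITION.sharedProject", "repoA", 0.40),
]
print([e.topic for e in filter_by_trust(events, 0.8)])
```

Raising the threshold trades recall for fewer false positive notifications, which matters most for additions.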
  48. 48. Examples of enrichment topics ENRICHMENT.[MORE | NEW] • dc:rights: dc:rights is present and original record was missing it • dc:identifier-if-DOI: DOI is present and original record was missing it • dc:type: dc:type is present and original record was missing it • dc:subject: dc:subject is present and original record was missing it • rel-to-project: relationship to project is present and original record was missing it • rel-to-dataset/software/similar-publication: relationship is present
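One way to derive such topics is to compare the record as harvested with its enriched counterpart, field by field. The Python sketch below does this under an assumed reading of the two topic kinds, which the slide does not spell out: NEW when the original record had no value for the field, MORE when it had some values and the enriched record adds others. The function name and the dictionary representation of records are also assumptions.

```python
def enrichment_topics(original, enriched):
    """Return enrichment topic names for fields that gained values.

    Records are dicts mapping a field name to a list of values.
    Assumed semantics: NEW = field was missing in the original,
    MORE = field existed and the enriched record adds further values.
    """
    topics = []
    for field, new_values in sorted(enriched.items()):
        old_values = set(original.get(field, []))
        added = [v for v in new_values if v not in old_values]
        if not added:
            continue  # nothing gained for this field
        kind = "MORE" if old_values else "NEW"
        topics.append(f"ENRICHMENT.{kind}.{field}")
    return topics


original = {"dc:subject": ["linguistics"]}
enriched = {
    "dc:subject": ["linguistics", "machine translation"],
    "dc:rights": ["info:eu-repo/semantics/openAccess"],
}
print(enrichment_topics(original, enriched))
```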
  49. 49. Examples of addition topics ADDITIONS • authorAffiliation: The publication has an author whose organization has a given institutional repository of reference • sharedProject: The publication has been funded by a project whose participants (orgs that are beneficiaries of the grant) have a given institutional repository of reference • authorRepositoryOfReference: The publication has an author with a given institutional repository of reference
  50. 50. Affiliation criterion: exploits the relationships publication → author → organization → repository
  51. 51. Author’s repository of reference: exploits the relationships publication → author → repository (where author → repository is “frequency of deposition”)
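The “frequency of deposition” link between an author and a repository could be approximated by counting where the author's known publications were deposited. The following Python sketch shows that idea; the function name, the input shape (one repository identifier per deposited publication) and the `min_share` cut-off are all assumptions, since the slide gives no details of the actual inference.

```python
from collections import Counter


def repository_of_reference(deposits, min_share=0.5):
    """Pick the repository an author deposits in most often.

    deposits: repository ids, one entry per deposited publication.
    min_share: hypothetical threshold on the winning repository's share;
    below it no repository of reference is assigned.
    """
    if not deposits:
        return None
    repo, count = Counter(deposits).most_common(1)[0]
    return repo if count / len(deposits) >= min_share else None


print(repository_of_reference(["repoA", "repoA", "repoB"]))
```

The share of the winning repository could also serve as a trust value for the resulting addition events.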
  52. 52. Relevance by project funding: exploits the relationships publication → project → organization → repository; high chances to yield false positive notifications
  53. 53. Subscriptions • Repository managers can subscribe to the service to receive notifications about records “assigned to them” and specify • Topics: enrichment.[more | new].X or addition.Y • How to be notified: RSS feed, email, APIs, etc. • When to be notified: instantly, every K days • Criteria on record fields (predicate) • Repository managers can test their subscription by searching the collection of potential notifications
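A subscription as described above combines a topic selection with a predicate on record fields. The Python sketch below models that combination and uses it to test a subscription against potential notifications. The class, its field names and the dictionary layout of events are hypothetical; delivery channel and frequency are left out to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable


def accept_all(record):
    """Default predicate: no criteria on record fields."""
    return True


@dataclass
class Subscription:
    repository: str
    topic_prefix: str                              # e.g. "enrichment.new." or "addition."
    predicate: Callable[[dict], bool] = accept_all  # criteria on record fields

    def matches(self, event):
        """True if the event is for this repository, topic and predicate."""
        return (
            event["target_repository"] == self.repository
            and event["topic"].lower().startswith(self.topic_prefix.lower())
            and self.predicate(event["record"])
        )


sub = Subscription("repoA", "enrichment.new.", lambda r: r.get("year", 0) >= 2015)
event = {
    "target_repository": "repoA",
    "topic": "ENRICHMENT.NEW.dc:rights",
    "record": {"year": 2016},
}
print(sub.matches(event))
```

Running `matches` over the stored potential notifications corresponds to the “test their subscription” step mentioned in the last bullet.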
  54. 54. Notifications • The service can notify the repositories in different ways • OpenAIRE recommended repository APIs for metadata ingestion (e.g. SWORD project); software modules for known platforms will be considered (e.g. DSpace, EPrints) • email to the repository managers • RSS feeds • The service avoids redundant notifications by keeping a history of delivered notifications
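Keeping a history of delivered notifications can be sketched as a set of keys that is consulted before each delivery. The Python example below is a minimal in-memory version; the class name, the choice of key (subscription, topic, record identifier, message) and the event layout are assumptions, and a real service would persist the history.

```python
class NotificationHistory:
    """Remembers what was already delivered, per subscription."""

    def __init__(self):
        self._delivered = set()

    def deliver_new(self, subscription_id, events):
        """Return only events not yet delivered to this subscription."""
        fresh = []
        for event in events:
            # Assumed identity of a notification; adjust to the real model.
            key = (subscription_id, event["topic"], event["record_id"], event["message"])
            if key not in self._delivered:
                self._delivered.add(key)
                fresh.append(event)
        return fresh


history = NotificationHistory()
event = {"topic": "ENRICHMENT.NEW.dc:rights", "record_id": "oai:repoA:42",
         "message": "info:eu-repo/semantics/openAccess"}
print(len(history.deliver_new("sub-1", [event, event])))
print(len(history.deliver_new("sub-1", [event])))
```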
  55. 55. Service architecture: OpenAIRE Information Space; Potential Notifications; Subscriptions; New Notifications by subscription; Notifications; Past Notifications by subscription (Phase 1, Phase 2, Phase 3). Web Dashboard: test subscriptions, inspect notification history, manage subscription configs
  56. 56. Standards for brokers: working with similar initiatives (Jisc, SHARE-US) on the definition of recommendations to enable information exchange between a network of Scholarly Communication Broker Services. Diagram: producers of events and consumers of events subscribe to, and are notified by, several brokers (each holding Subscriptions), which exchange subscriptions and channel notifications among themselves
  61. 61. Dashboard Functionalities • Data source registration and validation (against OpenAIRE guidelines) • Repository, data archive, journal, aggregator, CRIS system • Data source enrichment and fixing • Data source statistics • Data source usage statistics • Data source notification
  65. 65. Repository metrics: Local perspective
  66. 66. OpenAIRE to support Open Science • Facilitate research communities’ adoption of Open Science publishing principles by supporting publishing tools as-a-Service • Facilitate repositories in moving towards Open Science publishing by supporting notification-based research communication as-a-Service
  67. 67. Open Science as-a-Service (OSaaS) in OpenAIRE. Diagram labels: Catch-All Notification Broker; Research Community Dashboard; Harvesting; Search, Browse, Monitor, Research Impact; Subscribe & Receive Notification; Researchers; Content Providers; artefact types: Articles, Data, Projects, Methods, Packages
  68. 68. Research Communication barriers to Open Science. Repositories lack support for Open Science publishing: • No support for integration of repositories for methods or packages • Minimal or no support for links between artefacts in different repositories • No support for keeping repositories with up-to-date links between artefacts. Research communities lack a culture of Open Science publishing: • Lack of e-infrastructure for Open Science: e.g. repository limits above, exchange formats, workflows • Difficulties to self-organize and sustain research communication solutions: e.g. identify the problems, see the benefits, devise solutions, apply economy of scale
  69. 69. Enabling a Network of Research Communication Brokers. Diagram: producers of events and consumers of events subscribe to, and are notified by, several brokers (each holding Subscriptions), which exchange subscriptions and channel notifications among themselves
  70. 70. Repositories: Open Science benefits • Extending repository metadata models to Open Science: enabling addition of links to artefacts of any kind • Keeping their collection up-to-date (enrichments and additions): “almost real-time” exchange of information, i.e. notifications about links to other artefacts, missing properties, and missing artefacts • Fostering notification-based and federated dissemination of knowledge: enabling repositories to be notified of content of interest, enabling construction of research-focused aggregators by notifications
  71. 71. OpenAIRE towards Open Science: Research Community Dashboard and Repository Notification Broker • Served on-demand according to the OSaaS approach • Customizable by different disciplines and providers, each with different practices and maturity levels • Framework aligning communities and repositories on practices addressing transparent evaluation and reproducibility
  72. 72. @openaire_eu Thanks! Paolo Manghi & Pedro Principe