Published on

Slides accompanying the OpenAIRE Research Graph consultation webinar held on January 30th, 2020.
Presenter: Andrea Mannocci

Published in: Science

  1. 1. @openaire_eu OpenAIRE-Connect Review 23rd of April, 2018 - Brussels The OpenAIRE Research Graph: Bringing scholarly communication back into the hands of scientists. Andrea Mannocci, Paolo Manghi, Institute of Information Science and Technologies (ISTI), Consiglio Nazionale delle Ricerche (CNR)
  2. 2. Materializing the Open Science Graph Project Community Funder Funding Product Publication Research Data Software Organization Source Other res. products Mining Deduplication End-user feedback Scientific product catalogue Harvesting GUIDELINES Research Infrastructures Publishing IT
  3. 3. The OpenAIRE research graph: Providing an open metadata research graph of interlinked scientific products, with Open Access information, linked to funding information and research communities. Open, Complete, De-duplicated, Transparent, Participatory, Decentralized, Trusted
  4. 4. Complete: community-trusted sources Academic Graph … and more
  5. 5. De-duplicated • Metadata records corresponding to equivalent objects are merged: Scientific products, Organizations • More information about the de-duplication framework used by OpenAIRE can be found by searching Zenodo for: “De-duplicating the OpenAIRE Scholarly Communication Big Graph” (poster) and “GDup: De-Duplication of Scholarly Communication Big Graphs”
  6. 6. Participatory • Rely on quality scholarly communication sources of different kinds • Include solutions and content from any interested and known content provider in scholarly communication: Institutional repositories; Aggregators; Data archives; Software repositories; Research infrastructure sources; Funder grant databases; Authors & Orgs entity registries; Publishers & journals
  7. 7. Transparent • Metadata in the graph includes provenance when harvested and reliability indicators (trust) when obtained from mining
  8. 8. Decentralized • Preservation and ownership beyond OpenAIRE: exchanged with other graph initiatives • Broker Service: redistributed via subscription and notification to contributing data sources • Openly accessible via APIs
  9. 9. Trusted • Authors in the loop to enrich their ORCID record • Validation of end-user “claims” • ORCID member since December 2019
  10. 10. Populating the graph
  11. 11. Harvesting: Revised Classification of Research Products • Publications: Article, Preprint, Report, … • Datasets: Dataset, Collection, Clinical Trials, … • Software: Research Software, … • Other Research Products: Service, Workflow, Interactive Resource, … Harvested from: Institutional/publication repositories; Journals/publishers; Data repositories; Other Products repositories; Software repositories
  12. 12. Open Science publishing Bridging RIs and Scholarly Communication Transparency and reproducibility e-Infrastructures and Research Infrastructures Scholarly Communication infrastructure Dataset Method Thematic Service Dataset Experiment Publishing the experiment Input Dataset Input Method Output Dataset Experiment product Thematic Service Parameters Experiment repo Research data, Software, Workflows, Publications Data repo Method repo Publications IT Harvesting
  13. 13. • EPOS Research Infrastructure Reproducibility Transparency Seamless publishing Open Science publishing workflows
  14. 14. Pre-processed sources publication-dataset links 480Mi bilateral links CrossRef enriched 85Mi publication records DOIBoost Academic Graph Up-to-date every 6 months
  15. 15. Information propagation. Diagram entities: Product, Source, Country, Project, Organization, Community, Publication, Dataset, Funder (National Funder). Diagram relations: supplementedBy, fundedBy, hostedBy (institutional repository), located, funds, jurisdiction, ofInterest. Figures shown: 157K, 8Mi, 10K
  16. 16. Production: Open Access CAPs. BETA: Open Science CAPs. Bar charts (Old CAP vs New CAP; PROD vs BETA) for literature, research data, software, other. Chart labels: 110Mi, 30Mi, 1Mi, 10Mi, 100K, 180K, 3Mi, 7.5Mi. Harvested content • Data sources 10K+ • Records ~340Mi (120Mi coming from BASE) • Publication full-texts ~12Mi (3.5Mi coming from Springer N.) • Links (also text-mined) ~960Mi
  17. 17. Liaisons • Microsoft Research (being finalised) • Unpaywall (ongoing) • ORCID membership (ongoing) • RDA IG Open Science Graphs for FAIR Data: FREYA, ResearchGraph, OpenCitations, Open Knowledge Research Graph; IG Session at RDA Helsinki 2019 (15th of October 2019) Academic Graph
  18. 18. Roadmap to production • Open Consultation - Phase 1 (Dec 2019) • Open Consultation - Phase 2 (Jan – Feb 2020) • OpenAIRE Research Graph (Spring 2020) Collecting feedback via
  19. 19. Trello for feedback
  20. 20. Thank you! Andrea Mannocci
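
Slide 5 states that metadata records corresponding to equivalent objects are merged. The sketch below illustrates that general idea only; it is not the GDup framework referenced on the slide. The record fields (`doi`, `title`, `source`), the grouping key, and every function name are hypothetical and chosen for illustration. Records are grouped by a lowercased DOI, or by a normalized title when no DOI is present, and each group is collapsed into one record that remembers which sources contributed to it.

```python
from collections import defaultdict


def normalize_title(title):
    """Lowercase a title and keep only letters and digits (hypothetical rule)."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def dedup_key(record):
    """Build a grouping key: the DOI when present, else the normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    return "title:" + normalize_title(record.get("title", ""))


def merge_group(records):
    """Merge equivalent records: the first non-empty value wins for each field,
    and the list of contributing sources is kept as simple provenance."""
    merged = {"sources": []}
    for rec in records:
        for field, value in rec.items():
            if field == "source":
                if value not in merged["sources"]:
                    merged["sources"].append(value)
            elif value and not merged.get(field):
                merged[field] = value
    return merged


def deduplicate(records):
    """Group records by key and return one merged record per group."""
    groups = defaultdict(list)
    for rec in records:
        groups[dedup_key(rec)].append(rec)
    return [merge_group(group) for group in groups.values()]


# Illustrative input: two records describe the same publication.
records = [
    {"doi": "10.1234/ABC", "title": "Graph Study", "source": "repo-a"},
    {"doi": "10.1234/abc", "title": "", "source": "repo-b", "year": 2019},
    {"doi": "", "title": "Another Paper", "source": "repo-c"},
]
result = deduplicate(records)
```

A known limitation of this sketch is that a record with a DOI and a record without one are never grouped together, even when their titles match; a real de-duplication framework would use richer similarity matching.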