Developing Infrastructure to Support Closer Collaboration of Aggregators with Open Repositories

Presented at LIBER Conference, London 24-26 June 2015

Published in: Education

  1. Developing Infrastructure to Support Closer Collaboration of Aggregators with Open Repositories. Dr. Nancy Pontika & Dr. Petr Knoth, COnnecting Repositories (CORE), Open University, UK. LIBER 2015, 24 – 26 June, London
  2. Mission of CORE: Aggregate all open access content distributed across different systems worldwide, enrich this content and provide access to it through a set of services … [Source: http://core.ac.uk/about#mission]
  3. Need for a UK aggregator. Bringing the UK’s open access research outputs together: • Feasibility study commissioned by Jisc, published June 2014 • Referred to as “Open Mirror” [Source: https://repository.jisc.ac.uk/5570/1/JISC_REPORT_open_mirror_090514_FINAL_WEB.pdf]
  4. Three levels of support • Programmable Data Access: CORE API, CORE Data Dumps (audience: researchers, developers, companies) • Transaction Information Access: CORE Portal, CORE Mobile, CORE Plugin (audience: researchers, students, lifelong learners) • Analytical Information Access: CORE Policy, CORE Compliance Analytics, CORE Dashboard (audience: funders, governments, data providers) [Source: http://www.dlib.org/dlib/november12/knoth/11knoth.html]
  5. CORE Statistics • Content: 20M+ records, 600+ repositories, 1.8M+ full-texts • The UK national aggregator - Jisc • Full-text aggregator (not just metadata) • Placed among Top 10 search engines for research that go beyond Google [Jisc, 2013] • Listed among Top 100 Thesis and Dissertation Resources • Part of Jisc’s Repositories Shared Services Project (RSSP)
  6. Aggregation process • Metadata download, extraction and cleaning • Full-text harvesting • Text extraction • Language detection • Extraction of citation references from text • Identification of related content • Detection of duplicate items • Parsing of author names • Indexing
  7. CORE Applications • CORE Portal: search engine providing open access content • CORE Mobile: Android and iOS apps • CORE Plugin: for repositories and journals • CORE API: programmable access to millions of resources • CORE Dashboard: tool for repository managers
  8. CORE Dashboard: purpose • Harvested Records • Metadata • Harvesting Process • Standards • Repository Managers • Funders • Repositories • Journals • Data Providers • Collaboration • Quality • Transparency
  9. Institution main page
  10. Edit repository information
  11. Invitations
  12. Content
  13. Manage record visibility status: Take down
  14. Manage record visibility status: Take down
  15. Manage record visibility status: Take down, Take up
  16. Manage record visibility status: Take down, Take up
  17. Update metadata records • Asynchronous process • Item is queued in the CORE system • Record is updated within 12 hours
  18. Statistics
  19. Issues: 3 types • When harvesting your repository/document we encountered an error that we couldn't resolve. These errors need to be fixed in order to harvest your repository/document. • We encountered an error but we were still able to harvest the repository/document. We strongly recommend that these issues are resolved as they may lead to incompatibility problems in the future. • This may not be a problem, but it may be a sign of misconfiguration or future incompatibilities.
  20. Issues: good news
  21. Issues: good news
  22. Issues: bad news…
  23. Issues: Robots.txt
  24. Issues: Robots.txt
  25. Issues: Document Issues
  26. Issues: Malformed PDF URL
  27. Dashboard benefits - Increased and simplified collaboration between aggregators and content providers - Improved control of the content provider over the harvested content - Reduction of scepticism and fear of sharing content with other systems - Improvement of the harvesting process - Broadening of the open access content discoverability and thus reuse of the open access content where permitted
  28. Would you like to take a look? Dashboard still in BETA but we welcome volunteer testers. Email me at nancy.pontika[at]open.ac.uk
  29. Many thanks to… CORE developers: • Matteo Cancellieri • Samuel Pearce • Drahomira Herrmannova • Lucas Anastasiou Volunteer testers: • Chris Biggs, Metadata & Repository Specialist, Open University • Nick Sheppard, Repository Developer, Leeds Beckett University
  30. Thank you. Questions. CORE Contacts: Nancy Pontika nancy.pontika[at]open.ac.uk, Petr Knoth petr.knoth[at]open.ac.uk. Website: http://core.ac.uk Twitter: @oacore
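
Note on slide 17 (Update metadata records): the slide describes an asynchronous process in which an item is queued and the record is updated within 12 hours. The sketch below illustrates that general pattern only. It is not CORE's implementation: the in-memory queue and the names `UpdateRequest`, `request_metadata_update` and `process_pending` are hypothetical, and only the 12-hour figure comes from the slide.

```python
import queue
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

# The 12-hour window is taken from the slide; everything else is an assumption.
MAX_DELAY = timedelta(hours=12)


@dataclass
class UpdateRequest:
    """A hypothetical queued request to refresh one metadata record."""
    record_id: str
    queued_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    @property
    def due_by(self) -> datetime:
        # Latest time by which the update should have been applied.
        return self.queued_at + MAX_DELAY


# In-memory stand-in for whatever queue a real system would use.
pending: "queue.Queue[UpdateRequest]" = queue.Queue()


def request_metadata_update(record_id: str) -> UpdateRequest:
    """Queue the record and return immediately; the caller does not wait."""
    req = UpdateRequest(record_id)
    pending.put(req)
    return req


def process_pending(reharvest) -> list:
    """Drain the queue, calling reharvest(record_id) once per queued item."""
    done = []
    while True:
        try:
            req = pending.get_nowait()
        except queue.Empty:
            break
        reharvest(req.record_id)
        done.append(req.record_id)
    return done
```

In this sketch a repository manager's click would call `request_metadata_update`, and a background job would call `process_pending` often enough to stay inside the 12-hour window.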

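
Note on slide 26 (Malformed PDF URL): the slide is a screenshot, so the transcript does not say which checks the Dashboard applies. As a rough illustration of what "malformed" could mean, the sketch below flags a few common problems with Python's standard `urllib.parse`. The function name `pdf_url_problems` and the specific checks are assumptions, not CORE's rules.

```python
from urllib.parse import urlsplit


def pdf_url_problems(url: str) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # Whitespace inside a URL often comes from broken metadata or line wrapping.
    if any(ch.isspace() for ch in url):
        problems.append("contains whitespace")
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        return problems + ["not parseable as a URL"]
    if parts.scheme not in ("http", "https"):
        problems.append("scheme is not http or https")
    if not parts.netloc:
        problems.append("missing host")
    return problems
```

For example, `pdf_url_problems("repo.example.org/paper.pdf")` reports both a missing scheme and a missing host, because without `//` the whole string is parsed as a path.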
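
Note on slides 23 and 24 (Issues: Robots.txt): these slides are screenshots and the transcript does not record what the issue is. One plausible reading, assumed here, is that a repository's robots.txt prevents a harvester from reaching full-text files. A repository manager could test that offline with Python's standard `urllib.robotparser`. The user agent string, the example rules and the URLs below are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: every crawler is barred from the full-text directory.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /bitstream/
"""
```

With these example rules, a metadata page such as `/handle/1` stays reachable while anything under `/bitstream/` is blocked, which would leave an aggregator with metadata but no full text.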