CORE: Aggregating and Enriching
Content to Support Open Access
            Petr Knoth
        The Open University




              1/52
Outline
1. Aggregating Open Access (OA) publications – why, how, what
   for?
2. The CORE system
3. Supporting research in mining databases of scientific
   publications (DiggiCORE)




                              2/52
Growth of items in Open Access repositories




                         4/52
Growth of Open Access repositories




                         5/52
Growth of articles in OA journals




                           6/52
Growth of OA journals




                        7/52
Green Open Access - statistics




                       8/52
Why do we need aggregations?
“Each individual repository is of limited value for research: the real
power of Open Access lies in the possibility of connecting and tying
together repositories, which is why we need interoperability. In
order to create a seamless layer of content through connected
repositories from around the world, Open Access relies on
interoperability, the ability for systems to communicate with each
other and pass information back and forth in a usable format.
Interoperability allows us to exploit today's computational power so
that we can aggregate, data mine, create new tools and
services, and generate new knowledge from repository content.”
                                                   [COAR manifesto]


                                9/52
Access to information according to the level of abstraction

[Figure: architecture of an aggregation system. Components, left to right:
repositories; metadata transfer interoperability; metadata and content;
semantic enrichment (aggregation); OLTP and OLAP; interfaces; analytical,
transaction and raw data information access.]

                                                                       10/52
Who should be supported by aggregations?

The following user groups (divided according to the level of
abstraction of information they need):
   •   Raw data access.
   •   Transaction information access.
   •   Analytical information access.




                                    11/52
Who should be supported by aggregations?

• The following user groups (divided according to the level of
  abstraction of information they need):
   •   Raw data access. Developers, DLs, DL researchers, companies …
   •   Transaction information access. Researchers, students, life-long learners …
   •   Analytical information access. Funders, government, business intelligence …




                                     12/52
Layers of an aggregation system

Layers, top to bottom:
   •   Interfaces
   •   OLTP and OLAP
   •   Enrichment
   •   Metadata and Content
   •   Metadata Transfer Interoperability

                                        13/52
Layers of an aggregation system

Layers, top to bottom, with examples:
   •   Interfaces: APIs (REST, SOAP, XML-RPC), UIs, Dashboards
   •   OLTP and OLAP
   •   Enrichment
   •   Metadata (Dublin Core, XML, RDF …) and Content (PDF, Word …)
   •   Metadata Transfer Interoperability: OAI-PMH, OAI-ORE …
   •   Further labels on the slide (position not recoverable): Statistics,
       Catalog records, Annotations

                                        14/52
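The lowest layers named above (OAI-PMH for transfer, Dublin Core for metadata) can be illustrated with a minimal sketch. It parses a hand-written OAI-PMH ListRecords response and pulls out the Dublin Core fields. The sample record, the example.org identifiers and the function name parse_records are assumptions made for illustration; the slides do not show how CORE itself parses harvested metadata.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-written ListRecords response used only for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example paper</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:identifier>http://example.org/papers/1.pdf</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Return one dict per <record>, holding its OAI identifier and DC fields."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "oai_id": record.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS
            ),
            "titles": [e.text for e in record.iterfind(".//dc:title", NS)],
            "creators": [e.text for e in record.iterfind(".//dc:creator", NS)],
            "identifiers": [e.text for e in record.iterfind(".//dc:identifier", NS)],
        })
    return records


if __name__ == "__main__":
    for rec in parse_records(SAMPLE_RESPONSE):
        print(rec["oai_id"], rec["titles"], rec["identifiers"])
```

In this sample the dc:identifier points at a PDF, which is the kind of link a harvester could follow to move from the Metadata layer to the Content layer.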
Access to information according to the level of abstraction

[Figure: the aggregation architecture diagram. Components, left to right:
repositories; metadata transfer interoperability; metadata and content;
enrichment; OLTP and OLAP; interfaces; analytical, transaction and raw data
information access.]

                                                          15/52
Related systems




     16/52
Aggregation projects – BASE

[Figure: the aggregation architecture diagram (repositories, metadata
transfer interoperability, metadata, content, enrichment, OLTP, OLAP,
interfaces, access levels), shown for BASE.]

                                                       17/52
Aggregation projects – OAIster/WorldCat

[Figure: the aggregation architecture diagram (repositories, metadata
transfer interoperability, metadata, content, enrichment, OLTP, OLAP,
interfaces, access levels), shown for OAIster/WorldCat.]

                                                       18/52
Aggregation projects – RepUK

[Figure: the aggregation architecture diagram (repositories, metadata
transfer interoperability, metadata, content, enrichment, OLTP, OLAP,
interfaces, access levels), shown for RepUK.]

                                                       19/52
Aggregations need access to content, not just metadata!

• Certain metadata types can be created only at the level of the
  aggregation
• Certain metadata can change over time
• Ensuring content:
   • accessibility
   • availability
   • validity
   • quality
   • …



                               20/52
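One of the checks listed above, content validity, can be sketched under a simple assumption: a harvested full text is accepted only if it starts with the PDF header marker. The function names and the sample byte strings are hypothetical, and the slides do not say which checks CORE actually runs; this is only one cheap test among the many that accessibility, availability, validity and quality would require.

```python
def looks_like_pdf(data: bytes) -> bool:
    """A well-formed PDF file begins with the bytes %PDF-."""
    return data.startswith(b"%PDF-")


def split_by_validity(documents):
    """Split a {name: bytes} mapping into (valid, invalid) lists of names."""
    valid, invalid = [], []
    for name, data in documents.items():
        (valid if looks_like_pdf(data) else invalid).append(name)
    return valid, invalid


if __name__ == "__main__":
    # Hypothetical harvested payloads: one PDF, one HTML error page.
    harvested = {
        "paper-1": b"%PDF-1.4\n...",
        "paper-2": b"<html><body>404 Not Found</body></html>",
    }
    print(split_by_validity(harvested))
```

A check like this needs the content itself, which is the point of the slide: it cannot be done by an aggregator that holds only metadata.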
Aggregation projects – CiteSeerX

[Figure: the aggregation architecture diagram (repositories, metadata
transfer interoperability, metadata, content, enrichment, OLTP, OLAP,
interfaces, access levels), shown for CiteSeerX.]

                                                       21/52
Should an aggregation system support all three user types?

            This can be realised by more than one system,
              provided that the dataset is the same!




                             22/52
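The idea of several services sharing one dataset can be sketched as three small functions, one per access level, all reading the same records. The records, field names and function names are hypothetical and only illustrate the split into raw, transaction and analytical access.

```python
from collections import Counter

# Hypothetical records standing in for the shared aggregated dataset.
DATASET = [
    {"id": 1, "title": "Paper A", "repository": "repo-x", "year": 2010},
    {"id": 2, "title": "Paper B", "repository": "repo-x", "year": 2011},
    {"id": 3, "title": "Paper C", "repository": "repo-y", "year": 2011},
]


def raw_data_access(dataset):
    """Raw data access: hand out every record unchanged (e.g. to developers)."""
    return list(dataset)


def transaction_access(dataset, record_id):
    """Transaction access: look up a single record (e.g. for a researcher)."""
    for record in dataset:
        if record["id"] == record_id:
            return record
    return None


def analytical_access(dataset):
    """Analytical access: aggregate over all records (e.g. for a funder)."""
    return Counter(record["repository"] for record in dataset)


if __name__ == "__main__":
    print(len(raw_data_access(DATASET)))
    print(transaction_access(DATASET, 2))
    print(analytical_access(DATASET))
```

The three functions could live in three separate systems; what matters for consistency is that they all read from DATASET rather than from diverging copies.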
Outline
1. Aggregating Open Access (OA) publications – why, how, what
   for?
2. The CORE system
3. Supporting research in mining databases of scientific
   publications (DiggiCORE)




                              23/52
CORE objectives
• CORE aims to provide a comprehensive technical infrastructure
  for Open Access scholarly publications that will support access
  and reuse of scholarly materials at different levels of abstraction.
• A nation-wide aggregation system that will improve the discovery
  of publications stored in British Open Access Repositories (OARs).




                                24/52
What does CORE provide at different aggregation levels?

[Figure: the aggregation architecture diagram (repositories, metadata
transfer interoperability, metadata, content, enrichment, OLTP, OLAP,
interfaces, access levels).]

                                                         25/52
CORE functionality




                     26/52
CORE functionality
Step 1: Metadata and full-text harvesting



                       Content harvesting, processing




                                    27/52
What does CORE provide at different aggregation levels?

[Figure: the aggregation architecture diagram (repositories, metadata
transfer interoperability, metadata, content, enrichment, OLTP, OLAP,
interfaces, access levels), annotated with: semantic similarity, citation
extraction, classification, …]

                                                         28/52
CORE functionality
Step 2: Semantic enrichment




                                      Semantic enrichment




                              29/52
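Semantic similarity between two full texts can be sketched with cosine similarity over raw term counts. This is an assumption for illustration only: the slides do not state which similarity measure or text representation CORE uses, and a real system would normally add tokenisation, stop-word removal and term weighting.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a text as lower-cased, whitespace-split term counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    doc1 = term_vector("open access repositories")
    doc2 = term_vector("open access journals")
    # Two of three terms are shared, so the similarity is 2/3.
    print(round(cosine_similarity(doc1, doc2), 3))
```

Because the measure works on the text itself, it is an example of metadata that can only be created once the aggregation holds content, not just catalogue records.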
What does CORE provide at different aggregation levels?

[Figure: the aggregation architecture diagram (repositories, metadata
transfer interoperability, metadata, content, enrichment, OLTP, OLAP,
interfaces, access levels).]

                                                         30/52
CORE functionality
Step 3: Providing a set of services on top of the aggregation




                        Providing services




                                    31/52
CORE applications

 •   CORE Portal
 •   CORE Mobile
 •   CORE Plugin
 •   CORE API
 •   Repository Analytics




                            32/52
What does CORE provide at different aggregation levels?

[Figure: the aggregation architecture diagram (repositories, metadata
transfer interoperability, metadata, content, enrichment, OLTP, OLAP,
interfaces, access levels).]

                                                         33/52
CORE Applications
CORE Portal – Allows searching and navigating scientific publications
aggregated from Open Access repositories




                                   34/52
CORE Applications

CORE Mobile – Allows searching and
navigating scientific publications
aggregated from Open Access
repositories




                                35/52
CORE Applications
CORE Plugin – A plugin for other systems that provides recommendations
for related items.




                                 36/52
What does CORE provide at different aggregation levels?

[Figure: the aggregation architecture diagram (repositories, metadata
transfer interoperability, metadata, content, enrichment, OLTP, OLAP,
interfaces, access levels).]

                                                         37/52
CORE Applications
CORE API – Enables external systems and services to interact with the
CORE repository.




                                  38/52
What does CORE provide at different aggregation levels?

[Figure: the aggregation architecture diagram (repositories, metadata
transfer interoperability, metadata, content, enrichment, OLTP, OLAP,
interfaces, access levels).]

                                                         39/52
CORE Applications
Repository Analytics – An analytical tool supporting providers of
open access content (in particular repository managers).




                                   40/52
What does CORE provide at different aggregation levels?

[Figure: the aggregation architecture diagram, with CORE services placed at
the three access levels:]
   •   Analytical information access: Repository Analytics
   •   Transaction information access: CORE Portal, CORE Mobile, CORE Plugin
   •   Raw data access: CORE API

                                                         41/52
CORE statistics
• Content
   • 5.4M records
   • 192 repositories
   • 402k full-texts
• Started: February 2011
• Budget: £140k




                           42/52
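The figures above allow a quick back-of-the-envelope calculation of how much of the aggregation is backed by full text. The numbers are the rounded values from the slide, so the results are approximate.

```python
# Rounded figures taken from the CORE statistics slide.
RECORDS = 5_400_000
REPOSITORIES = 192
FULL_TEXTS = 402_000

# Share of metadata records that also have a harvested full text.
full_text_share = FULL_TEXTS / RECORDS

# Average number of records contributed per repository.
records_per_repository = RECORDS / REPOSITORIES

print(f"{full_text_share:.1%}")          # prints 7.4%
print(f"{records_per_repository:,.0f}")  # prints 28,125
```

Roughly one record in thirteen had a full text at the time, which puts the earlier point about aggregating content, not just metadata, into numbers.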
Outline
1. Aggregating Open Access (OA) publications – why, how, what
   for?
2. The CORE system
3. Supporting research in mining databases of scientific
   publications (DiggiCORE)




                              43/52
Partners




Advisory Board



                 44/52
Objective


Software for exploration and analysis of very large and
fast-growing amounts of research publications stored
across Open Access Repositories (OAR).




                           45/52
DiggiCORE networks




Three networks: (a) semantically related papers,
(b) citation network, (c) author citation network


                          46/52
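The relation between networks (b) and (c) can be sketched as follows: given paper-level citations and the authors of each paper, every citation is expanded into author-to-author edges with a count. The papers, authors and function name are hypothetical; the slides do not describe how DiggiCORE builds its networks, and this sketch keeps self-citations rather than filtering them.

```python
from collections import defaultdict

# Hypothetical input: authors per paper, and (citing, cited) paper pairs.
PAPER_AUTHORS = {
    "p1": ["Alice"],
    "p2": ["Bob", "Carol"],
    "p3": ["Alice", "Bob"],
}
PAPER_CITATIONS = [("p2", "p1"), ("p3", "p1"), ("p3", "p2")]


def author_citation_network(paper_authors, paper_citations):
    """Map (citing_author, cited_author) to the number of citing papers."""
    edges = defaultdict(int)
    for citing, cited in paper_citations:
        for citing_author in paper_authors[citing]:
            for cited_author in paper_authors[cited]:
                edges[(citing_author, cited_author)] += 1
    return dict(edges)


if __name__ == "__main__":
    network = author_citation_network(PAPER_AUTHORS, PAPER_CITATIONS)
    for (source, target), weight in sorted(network.items()):
        print(f"{source} -> {target}: {weight}")
```

Here Bob cites Alice twice (through p2 and p3), so that edge carries weight 2, while the edge from Alice to Alice is a self-citation that an analysis of citation behaviour might want to treat separately.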
DiggiCORE objectives

Allow researchers to use this platform to analyse
publications.
Why?
•   To identify patterns in the behaviour of research
    communities
•   To detect trends in research disciplines
•   To gain new insights into the citation behaviour of researchers
•   To discover features that distinguish papers with high impact



                               47/52
Questions the system can help answer
•   What are the attributes of high-impact publications?
•   Do these attributes differ in the humanities, social sciences and
    computer sciences?
•    What are the features of research groups within disciplines and
    how do these features relate to contributions generated by the
    group?
•   What are the attributes of high-impact authors and what is their
    role within the group?
•    What are the dynamics of successful research groups?



                                48/52
Questions the system can help answer
•   What is the mechanism of cross-fertilisation within
    disciplines, especially between the humanities and the
    sciences?
•   Who are the authors whose work is worth monitoring because
    they contribute to the achievements of their own discipline and
    also inspire other disciplines?
•   How should the novice in the discipline get acquainted with key
    achievements in the discipline?
•    How should he/she search for the most important publications?



                               49/52
Summary
•   The rapid growth of OA content provides both an opportunity
    and a challenge.
•   Aggregations should serve the needs of different user groups.
•   Aggregations need to aggregate content, not just metadata.
•   We can have many services that are part of the
    infrastructure, but they should all work with the same data.




                               50/52
Thank you!




Yes we can!
   51/52
52/52

Semantometrics: Towards Fulltext-based Research Evaluationpetrknoth
 
Aggregating Research papers from Publishers' Systems to Support Text and Data...
Aggregating Research papers from Publishers' Systems to Support Text and Data...Aggregating Research papers from Publishers' Systems to Support Text and Data...
Aggregating Research papers from Publishers' Systems to Support Text and Data...petrknoth
 
My repository is being aggregated: a blessing or a curse?
My repository is being aggregated: a blessing or a curse?My repository is being aggregated: a blessing or a curse?
My repository is being aggregated: a blessing or a curse?petrknoth
 
FOSTER - Content Delivery (WP3)
FOSTER - Content Delivery (WP3)FOSTER - Content Delivery (WP3)
FOSTER - Content Delivery (WP3)petrknoth
 

CORE: Aggregating and Enriching Content to Support Open Access

  • 1. CORE: Aggregating and Enriching Content to Support Open Access – Petr Knoth, The Open University
  • 2. Outline 1. Aggregating Open Access (OA) publications – why, how, what for? 2. The CORE system 3. Supporting research in mining databases of scientific publications (DiggiCORE)
  • 3. Outline 1. Aggregating Open Access (OA) publications – why, how, what for? 2. The CORE system 3. Supporting research in mining databases of scientific publications (DiggiCORE)
  • 4. Growth of items in Open Access repositories
  • 5. Growth of Open Access repositories
  • 6. Growth of articles in OA journals
  • 7. Growth of OA journals
  • 8. Green Open Access – statistics
  • 9. Why do we need aggregations? “Each individual repository is of limited value for research: the real power of Open Access lies in the possibility of connecting and tying together repositories, which is why we need interoperability. In order to create a seamless layer of content through connected repositories from around the world, Open Access relies on interoperability, the ability for systems to communicate with each other and pass information back and forth in a usable format. Interoperability allows us to exploit today's computational power so that we can aggregate, data mine, create new tools and services, and generate new knowledge from repository content.” [COAR manifesto]
  • 10. Access to information according to the level of abstraction [Diagram labels: Repository; Metadata Transfer Interoperability; Metadata; Content; Semantic Enrichment; OLTP; OLAP; Interfaces; Aggregation; Raw data access; Transaction information access; Analytical information access]
  • 11. Who should be supported by aggregations? The following user groups (divided according to the level of abstraction of the information they need): • Raw data access. • Transaction information access. • Analytical information access.
  • 12. Who should be supported by aggregations? The following user groups (divided according to the level of abstraction of the information they need): • Raw data access. Developers, DLs, DL researchers, companies … • Transaction information access. Researchers, students, life-long learners … • Analytical information access. Funders, government, business intelligence …
  • 13. Layers of an aggregation system [Diagram labels: Interfaces; OLTP; OLAP; Enrichment; Metadata; Content; Metadata Transfer Interoperability]
  • 14. Layers of an aggregation system [Diagram with examples per layer. Interfaces: APIs (REST, SOAP, XML-RPC), UIs, dashboards. Metadata Transfer Interoperability: OAI-PMH, OAI-ORE … Metadata: Dublin Core, XML, RDF … Content: PDF, Word … Further labels next to the OLTP, OLAP and Enrichment layers: Statistics, Catalog records, Annotations]
  • 15. Access to information according to the level of abstraction [Diagram: the aggregation layers and access levels from slide 10]
  • 17. Aggregation projects – BASE [Diagram: the aggregation layers and access levels from slide 10]
  • 18. Aggregation projects – OAISter/WorldCAT [Diagram: the aggregation layers and access levels from slide 10]
  • 19. Aggregation projects – RepUK [Diagram: the aggregation layers and access levels from slide 10]
  • 20. Aggregations need access to content, not just metadata! • Certain metadata types can be created only at the level of the aggregation • Certain metadata can change over time • Ensuring content: • accessibility • availability • validity • quality • …
  • 21. Aggregation projects – CiteSeerX [Diagram: the aggregation layers and access levels from slide 10]
  • 22. Should an aggregation system support all three user types? This can be realised by more than one system, provided that the dataset is the same!
  • 23. Outline 1. Aggregating Open Access (OA) publications – why, how, what for? 2. The CORE system 3. Supporting research in mining databases of scientific publications (DiggiCORE)
  • 24. CORE objectives • CORE aims to provide a comprehensive technical infrastructure for Open Access scholarly publications that will support access to and reuse of scholarly materials at different levels of abstraction. • A nation-wide aggregation system that will improve the discovery of publications stored in British Open Access Repositories (OARs).
  • 25. What does CORE provide at different aggregation levels? [Diagram: the aggregation layers and access levels from slide 10]
  • 27. CORE functionality. Step 1: Metadata and full-text harvesting (content harvesting, processing)
  • 28. What does CORE provide at different aggregation levels? Semantic similarity, citation extraction, classification, … [Diagram: the aggregation layers and access levels from slide 10]
  • 29. CORE functionality. Step 2: Semantic enrichment
  • 30. What does CORE provide at different aggregation levels? [Diagram: the aggregation layers and access levels from slide 10]
  • 31. CORE functionality. Step 3: Providing a set of services on top of the aggregation
  • 32. CORE applications • CORE Portal • CORE Mobile • CORE Plugin • CORE API • Repository Analytics
  • 33. What does CORE provide at different aggregation levels? [Diagram: the aggregation layers and access levels from slide 10]
  • 34. CORE Applications: CORE Portal – Allows searching and navigating scientific publications aggregated from Open Access repositories.
  • 35. CORE Applications: CORE Mobile – Allows searching and navigating scientific publications aggregated from Open Access repositories.
  • 36. CORE Applications: CORE Plugin – A plugin that lets a system provide recommendations for related items.
  • 37. What does CORE provide at different aggregation levels? [Diagram: the aggregation layers and access levels from slide 10]
  • 38. CORE Applications: CORE API – Enables external systems and services to interact with the CORE repository.
  • 39. What does CORE provide at different aggregation levels? [Diagram: the aggregation layers and access levels from slide 10]
  • 40. CORE Applications: Repository Analytics – An analytical tool supporting providers of open access content (in particular repository managers).
  • 41. What does CORE provide at different aggregation levels? [Diagram: the layers from slide 10 with CORE applications mapped to access levels. Analytical information access: Repository Analytics. Transaction information access: CORE Portal, CORE Mobile, CORE Plugin. Raw data access: CORE API]
  • 42. CORE statistics • Content • 5.4M records • 192 repositories • 402k full-texts • Started: February 2011 • Budget: £140k
  • 43. Outline 1. Aggregating Open Access (OA) publications – why, how, what for? 2. The CORE system 3. Supporting research in mining databases of scientific publications (DiggiCORE)
  • 45. Objective: software for the exploration and analysis of very large and fast-growing amounts of research publications stored across Open Access Repositories (OARs).
  • 46. DiggiCORE networks. Three networks: (a) semantically related papers, (b) citation network, (c) author citation network
  • 47. DiggiCORE objectives. Allow researchers to use this platform to analyse publications. Why? • To identify patterns in the behaviour of research communities • To detect trends in research disciplines • To gain new insights into the citation behaviour of researchers • To discover features that distinguish papers with high impact
  • 48. Questions the system can help answer • What are the attributes of high-impact publications? • Do these attributes differ in the humanities, social sciences and computer science? • What are the features of research groups within disciplines and how do these features relate to contributions generated by the group? • What are the attributes of high-impact authors and what is their role within the group? • What are the dynamics of successful research groups?
  • 49. Questions the system can help answer • What is the mechanism of cross-fertilisation within disciplines, especially between the humanities and the sciences? • Who are the authors whose work is worth monitoring because they contribute to the achievements of their own discipline and also inspire other disciplines? • How should a novice in the discipline get acquainted with its key achievements? • How should he/she search for the most important publications?
  • 50. Summary • The rapid growth of OA content provides both an opportunity and a challenge. • Aggregations should serve the needs of different user groups. • Aggregations need to aggregate content, not just metadata. • We can have many services that are part of the infrastructure, but they should work with the same data.
  • 51. Thank you! Yes we can!
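
The three CORE steps named on slides 27 to 31 (harvesting, semantic enrichment, services on top of the aggregation) can be pictured with a small sketch. It is only an illustration under stated assumptions, not CORE's implementation: the slides do not say how semantic similarity is computed, so plain bag-of-words cosine similarity over titles is swapped in here, and the sample records, the `records`/`record` wrapper elements and all function names are hypothetical. Only the Dublin Core element namespace is taken from the standard.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter

DC = "{http://purl.org/dc/elements/1.1/}"  # Dublin Core element namespace

# Hypothetical sample of harvested metadata. A real harvester would fetch
# such records from each repository (e.g. over OAI-PMH) instead.
SAMPLE = """
<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record><dc:identifier>a</dc:identifier>
    <dc:title>Aggregating open access repositories</dc:title></record>
  <record><dc:identifier>b</dc:identifier>
    <dc:title>Harvesting open access repositories at scale</dc:title></record>
  <record><dc:identifier>c</dc:identifier>
    <dc:title>Snake venom biochemistry</dc:title></record>
</records>
"""


def parse_records(xml_text):
    """Step 1 (harvesting): map each record identifier to its title."""
    root = ET.fromstring(xml_text.strip())
    return {
        rec.findtext(DC + "identifier"): rec.findtext(DC + "title")
        for rec in root.iter("record")
    }


def term_vector(text):
    """Lower-cased bag-of-words term counts for one text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def related(doc_id, records):
    """Steps 2 and 3 (enrichment, service): rank other records by similarity."""
    vectors = {i: term_vector(title) for i, title in records.items()}
    scores = [
        (other, cosine(vectors[doc_id], vec))
        for other, vec in vectors.items()
        if other != doc_id
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


records = parse_records(SAMPLE)
for other, score in related("a", records):
    print(other, round(score, 3))
```

With the sample data, record "b" ranks above record "c" for the query record "a", because it shares the terms "open", "access" and "repositories". A production system would work on full texts rather than titles, which is the point slide 20 makes about needing content and not just metadata.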