Data Archiving and Networked Services




Genericity versus expressivity –
reflections about the semantics
    of interoperable research
       information systems

 Andrea Scharnhorst, Frank van der Most, Christophe Gueret,
    Tamy Chambers (IU, Bloomington), Linda Reijnhoudt

     Presentation at the ACUMEN workshop, March 8, 2013,
                          Copenhagen


DANS is an institute of KNAW and NWO
Andrea Scharnhorst – “science located”




   •Head of eResearch at DANS (Data Archiving and Networked Services) and scientific
   coordinator of the Computational Humanities programme at the eHumanities group of the
   Royal Netherlands Academy of Arts and Sciences (KNAW)
Electronic Archiving SYstem (EASY) and NARCIS –
 Core services (‘products’) of DANS
                              www.easy.dans.knaw.nl


DANS as a non-proprietary information provider
 contributes to the transparency and accessibility
        of publicly funded research

   www.narcis.nl
Overview
•How did this research start?

•Bibliometrics, research information systems and Linked Open
Data

•The need for a core vocabulary

•Our proposal

•Outlook
What is a “bijzondere hoogleraar”?
Quantitative studies of science
       - scientometrics, bibliometrics, informetrics

Persons – Organizations – Projects

Input → Processes of knowledge creation → Output

Input:
•Number of scientists
•Number of PhD students
•R&D expenditure
•Instruments
•…

Output:
•Number of publications/citations
•Number of PhD students
•Number of patents
•…

[Diagram, lower part: Libraries and Archives sit between information provision (Education, Books, Data, Information resources) and information storage (Books, Journals, Data).]
What is a Research Information System?




Ref: KG Jeffery 2008 History of CRIS http://www.eurocris.org/Uploads/Web%20pages/historyCRIS/3HistoryofCRIS.ppt
See also: Nick Sheppard. "Learning How to Play Nicely: Repositories and CRIS". July 2010, Ariadne Issue 64
http://www.ariadne.ac.uk/issue64/wrn-repos-2010-05-rpt/
What do we want?
•   The dream: a single, user-curated, consistent and
    up-to-date source that knows everything about
    someone


•   Many systems aim to be that one source
The Dutch situation – many players
•   Metis is very detailed, fed by admin (universities, KNAW),
    but has a limited on-line interface.

•   CWTS is the ‘Scientific Observatory’ in NL for
    Research Evaluation, but the databases are
    not public.

•   NARCIS is the main national portal for those looking for information
    about researchers and their work.
Courtesy of Nick Veenstra (TU/e). See: http://ehumanities.nl/vivo-symposium-january-18-2013/
VIVO




http://nrn.cns.iu.edu/
Katy Börner, Mike Conlon, Jon Corson-Rikert, Ying Ding (eds). 2012.
VIVO: A Semantic Approach to Scholarly Networking and Discovery, Morgan & Claypool Publishers.
Different concepts – different data representations




 Data/Software …
Different concepts – different data representations
Result: Data does not travel well...
•   Publications from Frank van Harmelen
•   Decreasing number from system to system




     148                  38                  13
Web of Science has 43 publications,
and Google has 283!
Why is information lost?
•   Incentives
    o Keeping one data source up to date is costly
    o Keeping several up to date is even more so!


•   Standards
    o Information that cannot be expressed is lost


•   Confusion
    o Re-inventing the wheel and (partially) duplicating information
Conceptual model of the core ontology
Core vocabulary as ‘middle-ware’




                         US


       NL
Scope versus expressivity
[Diagram: scope on one axis (International, National, Institutional, Individual), expressivity on the other. Example for the facet "Position": Hoogleraar at the national level, Academisch Hoogleraar at the institutional level.]

 The core vocabulary proposal cuts across the usual multi-purpose
 ontologies at different scales of scope. It has one purpose
 (the presentation of a researcher's career). It does not translate 1:1,
 but defines shells of meaning (facets).
 Up-scaled (higher scope) it loses expressivity;
 down-scaled it gains expressivity in return for lesser interoperability
 of the fine-grained information.
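
The trade-off between scope and expressivity can be illustrated with a minimal sketch. It is not the authors' core ontology or any real vocabulary. The function names `upscale` and `comparable` are hypothetical, and so are the "Professor" and "Full Professor" labels and the US record. Only the scope levels and the two Dutch position labels come from the slide.

```python
# Scope levels named on the slide, ordered from broadest to narrowest.
SCOPES = ("international", "national", "institutional", "individual")

# One position described as nested "shells of meaning": one label per scope.
# "Hoogleraar" and "Academisch Hoogleraar" are taken from the slide;
# the international label is an assumption made for illustration.
NL_POSITION = {
    "international": "Professor",
    "national": "Hoogleraar",
    "institutional": "Academisch Hoogleraar",
}

# Entirely hypothetical US record, used only for comparison.
US_POSITION = {
    "international": "Professor",
    "national": "Full Professor",
}


def upscale(position, scope):
    """Keep only the facets at `scope` or broader; finer detail is dropped."""
    kept_levels = SCOPES[: SCOPES.index(scope) + 1]
    return {level: label for level, label in position.items() if level in kept_levels}


def comparable(a, b, scope):
    """Two records interoperate at `scope` if both carry the same label there."""
    return scope in a and scope in b and a[scope] == b[scope]


if __name__ == "__main__":
    # Up-scaled: less expressive, but comparable across countries.
    print(upscale(NL_POSITION, "international"))
    print(comparable(NL_POSITION, US_POSITION, "international"))
    # Down-scaled: more expressive, but no longer comparable.
    print(upscale(NL_POSITION, "institutional"))
    print(comparable(NL_POSITION, US_POSITION, "national"))
```

In this sketch the two records match only at the international level. The finer Dutch labels stay available locally but cannot be exchanged 1:1, which is the point the slide makes.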
Just another ontology?

There is no escape from machine-readable information
exchange and processing.

There is a limit to user-provided content.

There are institutional and national interests.

There is a tension between locally curated information and the
globalization of science.

Might not be ‘our’ system, but a system will come!
Discourse about this is needed!
VISION I: All research information
[change] (P. Doorn)
Börner K, Klavans R, Patek M, Zoss AM, et al. (2012) Design and Update of a Classification System: The UCSD Map of Science. PLoS ONE 7(7):
e39464. doi:10.1371/journal.pone.0039464 http://www.plosone.org/article/info:doi/10.1371/journal.pone.0039464




 VISION II: A way for researchers to present themselves
 (tailored), extracted from a research information ecosystem
eResearch DANS
•   Rene van Horik: Sustainability and permanence, multi-media sources, APARSEN, NEDIMAH
•   Frank van der Most: Scientific careers and cultures of data sharing, ACUMEN
•   Dirk Roorda: Queries as annotations, CLARIN, Circulation of knowledge
•   Linda Reijnhoudt: NARCIS, Visualizations
•   Katy Börner: Indiana University, Visiting fellow DANS-KNAW
•   Peter Doorn: eHistory, Clarin, Dariah, Clariah, Director of DANS
•   Christophe Guéret: Semantic web, complex networks, CEDAR, PI: WikiReg
•   Albert Moroño Peñuela: Semantic web, CEDAR
•   Ashkan Askpour: History, information sciences, IISH, CEDAR
•   Leen Breure: Enhanced publications, eHistory
•   Cristian Dinu: WikiReg
•   Marat Charlaganov: WikiReg

Thank you for your attention!
For more information please contact
Andrea.scharnhorst@dans.knaw.nl
