Integration of research literature and data
(InFoLiS)
Katarina Boland¹, Philipp Zumstein²
¹ GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany
² Mannheim University Library, Mannheim, Germany
CNI 2015 Spring Membership Meeting
April 14th, 2015
the InFoLiS project:
Integration of research data and publications
InFoLiS I: 05/2011 - 05/2013
InFoLiS II: 08/2014 - 08/2016
InFoLiS is funded by the DFG (SU 647/2-1)
Introduction
[Diagram: Catalogue (Publications: SSOAR (GESIS), Primo (UB MA), ...) and Data Catalogue (Research Data: da|ra (GESIS), ...); each receives a query and returns responses, and links connect the two]
InFoLiS Project Goals
1. Part I: Generation of Links
2. Part II: How can you reuse it?
Outline
Part I: Generation of Links
Recommendation:¹
Creator (Publication Date): Title. Publication Agent. Identifier
Creator (Publication Date): Title. Version. Publication Agent. Type of Resource. Identifier.
→ Extraction based on these patterns?
¹ see http://auffinden-zitieren-dokumentieren.de/zitieren/empfohlene-datenzitation/
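As a rough illustration of what extraction based on the recommended short form could look like, the sketch below parses "Creator (Publication Date): Title. Publication Agent. Identifier" with a single regular expression. The regex, the function name and the example citation are made up for illustration and are not part of InFoLiS. The second call shows why this approach falls short in practice: informal mentions in running text do not follow the recommended form.

```python
import re

# Hypothetical regex for the short form of the recommendation:
#   Creator (Publication Date): Title. Publication Agent. Identifier
# Assumes a four-digit year and no ". " inside title or agent.
CITATION = re.compile(
    r"^(?P<creator>.+?) \((?P<date>\d{4})\): "
    r"(?P<title>.+?)\. (?P<agent>.+?)\. (?P<identifier>\S+)$"
)


def parse_data_citation(text):
    """Return the citation fields as a dict, or None if the form does not match."""
    match = CITATION.match(text.strip())
    return match.groupdict() if match else None


# Invented example citation in the recommended form.
formal = parse_data_citation(
    "Doe, Jane (2012): Example Survey 2010. Example Data Archive. doi:10.0000/example"
)

# Informal mention in running text: no match.
informal = parse_data_citation(
    "data from the Socio-Economic Panel (SOEP) of the years 1990 and 2003 are used"
)
```

A fixed citation pattern only helps when authors follow the recommendation, which motivates the context-based approach on the following slides.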
Citation of Research Data
presentation and discussion of the empirical findings. For this purpose, data
from the Socio-Economic Panel (SOEP) of the years 1990 and 2003 are used
and for both periods, the impact factors are estimated using linear regression
models.
data from the <title> of the years <year> are used
References to Datasets
Table 1: Population forecast for Germany depending on age cohorts -
proportion in percent.
Data base: 10th Population Forecast of the Federal Statistical Office, variant 5.
(Data base: <number> <title> of the <publication agent>, variant <variant>)
References to Datasets
Consulted were furthermore ...
Consulted were furthermore title1, title2, title3, ..., titleN.
References to Datasets
Table 3: Sample of the surveys conducted in the years 2003 and 2004 as well
as size of the sample, with valid data from both surveys
(Source: Ditton et al. 2005a)
(Source: <citation of descriptive publication>)
References to Datasets
...are hard to detect!
see also...
Green, Toby (2009). We Need Publishing Standards for
Datasets and Data Tables. OECD Publishing White Paper.
doi: 10.1787/603233448430
Altman, Micah and Gary King (2007). A Proposed Standard
for the Scholarly Citation of Quantitative Data. In: D-Lib
Magazine 13.3.
url: http://www.dlib.org/dlib/march07/altman/03altman.html
References to Datasets
Automatic Identification of
References
Why not simply search for study titles in publications?
“ALLBUS/GGSS 1996 (Allgemeine Bevölkerungsumfrage der
Sozialwissenschaften/German General Social Survey 1996)”
“ALLBUS 96”
“Youth 2010”
General idea
How do humans recognize study references?
Source: Estimations based on SOEP, wave 2002.
Source: Estimations based on xyz, wave 2002.
Algorithm
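The general idea (a known study title reveals a context, and that context reveals further titles) can be sketched as a small bootstrapping loop. This is a toy illustration under simplifying assumptions, not the InFoLiS implementation: the function names, the three-word context window, the title shape and the example sentences other than the SOEP one are all invented here.

```python
import re

# Loose shape of a study title candidate (an assumption): a capitalised token,
# optionally followed by up to three more capitalised or numeric tokens.
TITLE = r"([A-Z][\w/-]*(?: [A-Z0-9][\w/-]*){0,3})"


def generalise(context):
    """Escape a context string and replace concrete years by a year wildcard."""
    return re.sub(r"\d{4}", lambda m: r"\d{4}", re.escape(context))


def induce_pattern(sentence, known_title, window=3):
    """Build a regex from the words surrounding a known title in one sentence."""
    before, found, after = sentence.partition(known_title)
    if not found:
        return None
    left = " ".join(before.split()[-window:])
    right = " ".join(after.split()[:window])
    return re.compile(generalise(left) + r"\s+" + TITLE + r"\s*" + generalise(right))


def bootstrap(sentences, seed_titles, rounds=2):
    """Alternate between learning patterns from titles and titles from patterns."""
    titles, patterns = set(seed_titles), {}
    for _ in range(rounds):
        for sentence in sentences:
            for title in list(titles):
                pattern = induce_pattern(sentence, title)
                if pattern is not None:
                    patterns[pattern.pattern] = pattern
        for sentence in sentences:
            for pattern in patterns.values():
                titles.update(m.group(1) for m in pattern.finditer(sentence))
    return titles, patterns


SENTENCES = [
    "Source: Estimations based on SOEP, wave 2002.",
    "Source: Estimations based on ALLBUS, wave 1996.",
    "We use data from the ALLBUS survey for our analysis.",
    "We use data from the Eurobarometer survey for our analysis.",
]
titles, patterns = bootstrap(SENTENCES, {"SOEP"})
```

Starting from the single seed "SOEP", the first round learns the "Estimations based on ..., wave <year>" context and finds "ALLBUS"; the second round learns a new context from "ALLBUS" and finds "Eurobarometer". A real system would also need to score patterns to keep overly general ones from drifting.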
for details see...
Katarina Boland, Dominique Ritze, Kai Eckert & Brigitte Mathiak (2012).
Identifying References to Datasets in Publications. In: Proceedings of the
Second International Conference on Theory and Practice of Digital Libraries
(TPDL), Lecture Notes in Computer Science Volume 7489, pp. 150-161. Berlin:
Springer. doi:10.1007/978-3-642-33290-6_17
Reference Extraction
Mapping to Datasets in da|ra
Strategies: 1) greedy; 2) exact; 3) best
Mapping to Datasets in da|ra:
granularity of registration vs. citation
[Diagram: tree of registered datasets with the nodes ALLBUS; ALLBUS 1996; ALLBUS 1998; ALLBUS 2000; ALLBUS 2000 CAPI/PAPI; ALLBUS - Cumulation 1980-2006; ALLBUS - Cumulation 1980-2008; ALLBUScompact; ALLBUScompact 2000; ALLBUScompact 2000 CAPI/PAPI; ALLBUScompact 2000 CAPI; ALLBUScompact - Cumulation 1980-2010; ...]
→ use ontology
Mapping to Datasets in da|ra
Vocabulary: e.g. DDI-RDF Discovery Vocabulary²
² Thomas Bosch, Richard Cyganiak, Arofan Gregory, Joachim Wackerow (2013): DDI-RDF Discovery Vocabulary: A Metadata Vocabulary for Documenting Research and Survey Data. In: Proceedings of the 6th Linked Data on the Web (LDOW) Workshop at the 22nd International World Wide Web Conference (WWW). CEUR Workshop Proceedings, pp. 46-55
Ontology: Approach
Links
Example: da|ra
Example: SSOAR
Integration of Links into Information
Systems
Thank you for your attention!
katarina.boland@gesis.org
Next part: How can you
reuse it?
Part II
How can you reuse it?
Work in Progress
Interna, Data Structure, Technology
(Internal) Data structure
Document
Pattern
Execution of Algorithm
Study Title
Study URI
Which studies are found in a document?
How was a pattern derived?
Which other study titles are found with the new configuration of the algorithm?
Technology stack
Web Services
RESTful API (web services)
• GET, POST, PUT, DELETE, PATCH resources
• Search, perform algorithms, upload files
• open for integration into other workflows, e.g. in
  • resource discovery systems
  • research data catalogues
  • digital repositories
• possible to orchestrate over a web interface for individual use
Lookup services
[Diagram: DB (links). Lookup service: publication URI → study URI. Reverse lookup service: study URI → publication URI.]
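The two lookup directions can be illustrated with a small in-memory link store that keeps one index per direction. This is only a sketch of the idea, assuming links are plain (publication URI, study URI) pairs; the class name, method names and example.org URIs are hypothetical and say nothing about how the InFoLiS database is actually organised.

```python
from collections import defaultdict


class LinkStore:
    """Toy stand-in for the links database with lookup in both directions."""

    def __init__(self):
        self._studies_by_publication = defaultdict(set)
        self._publications_by_study = defaultdict(set)

    def add_link(self, publication_uri, study_uri):
        """Record that a publication references a study."""
        self._studies_by_publication[publication_uri].add(study_uri)
        self._publications_by_study[study_uri].add(publication_uri)

    def lookup(self, publication_uri):
        """Lookup service: publication URI -> study URIs."""
        return sorted(self._studies_by_publication.get(publication_uri, ()))

    def reverse_lookup(self, study_uri):
        """Reverse lookup service: study URI -> publication URIs."""
        return sorted(self._publications_by_study.get(study_uri, ()))


store = LinkStore()
store.add_link("http://example.org/publication/1", "http://example.org/study/a")
store.add_link("http://example.org/publication/1", "http://example.org/study/b")
store.add_link("http://example.org/publication/2", "http://example.org/study/a")
```

Keeping both indexes in sync at write time makes each lookup a single dictionary access, at the cost of storing every link twice.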
Extraction of study URIs from a PDF
[Diagram: pdf (fulltext) → pdf2txt → txt (fulltext) → extract study titles (using DB (patterns)) → study titles → linking → study URI]
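The steps after pdf2txt in the pipeline above can be sketched as two small functions: one applies stored patterns to the full text, the other links the extracted titles to study URIs. The pattern, the title-to-URI table and all function names are invented for this example, and the PDF-to-text conversion is assumed to have happened already.

```python
import re

# Hypothetical content of the pattern database: regexes whose first group
# captures a study title.
PATTERNS = [re.compile(r"based on ([A-Z][\w-]*), wave \d{4}")]

# Hypothetical title-to-URI table used for the linking step.
STUDY_URIS = {"SOEP": "http://example.org/study/soep"}


def extract_study_titles(fulltext, patterns):
    """Apply every pattern to the text; keep each title once, in order of appearance."""
    titles = []
    for pattern in patterns:
        for match in pattern.finditer(fulltext):
            if match.group(1) not in titles:
                titles.append(match.group(1))
    return titles


def link_titles(titles, study_uris):
    """Map extracted titles to study URIs; titles without a known URI are skipped."""
    return {title: study_uris[title] for title in titles if title in study_uris}


def extract_study_uris(fulltext, patterns=PATTERNS, study_uris=STUDY_URIS):
    """Run title extraction followed by linking on already converted text."""
    return link_titles(extract_study_titles(fulltext, patterns), study_uris)


text = (
    "Source: Estimations based on SOEP, wave 2002. "
    "Further estimations based on XYZ, wave 2001."
)
found_titles = extract_study_titles(text, PATTERNS)
links = extract_study_uris(text)
```

Here "XYZ" is extracted as a title but produces no link, because the table has no URI for it; deciding what to do with such unmatched titles is left open in this sketch.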
Recognizing patterns
[Diagram: pdfs (fulltext) + seed → pattern recognizer → DB (pattern)]
Integration of publications and
research data
Quoting the Horizon Report 2014
“Visionary leadership for research data management
models is also required to determine how to best
incorporate data connections into library catalogs” (NMC
Horizon Report 2014 - Library Edition, p. 7)
Current situation: Several steps needed
• Common situation today:
  • Search online catalogue
  • Evaluate search results
  • Find the full text of the relevant source
  • Read the publication
  • Spot the research data
• Moreover, often the reverse information is missing completely
  • Which publications are built on some specific research data?
Two Approaches
Client-side
load additional data in catalogue view (e.g. over Ajax)
• enrich view, links
• up-to-date data
• Embed data in the web presentation
Server-side
add additional data in your catalogue database (e.g. Primo enrichment process)
• enrich view, links, search, sort, filter
• time-lagged because of the update mechanism
• Do the data fit into existing infrastructure? (fields, tables, database)
Integration as links
• Link from catalogue entry ...
• ... to the corresponding research data
Integration as popup
Cited research data: 2
• ALLBUS 2010 (used in 512 publications)
• part of ALLBUS (used in 13,456 publications)
• own research data (used in 1 publication)
Integration in search/sort
Cited data sets 4
Cited data sets 1
Sort by data citation
Integration in search/filter
Research data available
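The sort and filter views above presuppose that each catalogue record carries its cited datasets, as in the server-side approach. A minimal sketch of that enrichment step follows, assuming records are dicts with a "uri" key and links map publication URIs to study URIs; the field name cited_datasets, the function names and the sample data are invented here.

```python
def enrich(records, links):
    """Return copies of the records with a 'cited_datasets' field added."""
    return [
        dict(record, cited_datasets=sorted(links.get(record["uri"], [])))
        for record in records
    ]


def sort_by_data_citation(records):
    """Order records by the number of cited datasets, most first."""
    return sorted(records, key=lambda r: len(r["cited_datasets"]), reverse=True)


def with_research_data(records):
    """Filter: keep only records that cite at least one dataset."""
    return [r for r in records if r["cited_datasets"]]


catalogue = [
    {"uri": "pub:1", "title": "Publication A"},
    {"uri": "pub:2", "title": "Publication B"},
    {"uri": "pub:3", "title": "Publication C"},
]
links = {
    "pub:1": ["study:a"],
    "pub:3": ["study:a", "study:b", "study:c", "study:d"],
}

enriched = enrich(catalogue, links)
ranked = [r["title"] for r in sort_by_data_citation(enriched)]
available = [r["title"] for r in with_research_data(enriched)]
```

Because the enrichment is computed ahead of time, search, sort and filter can use the new field directly, which is the advantage the server-side column names; the price is that the field is only as fresh as the last update run.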
Enrich your research data catalogue
Cited in: Ritze, D., Paulheim, H., & Eckert, K. (2013). Evaluation Measures for Ontology Matchers in Supervised Matching Scenarios. In The Semantic Web – ISWC 2013 (pp. 392–407).
Tags from Publication: Supervised Ontology Matching, Evaluation, Recall, Precision, F-Measure, Precision@N-Curves, ROC-Curves, Precision-Recall-Curves
Current Goals of the Project
1. Expansion to other disciplines and languages
2. Linked data based infrastructure
3. Improve the reusability of generated links
Dissemination
• our web services will be open for everyone
• project webpage
  • http://infolis.github.io/
  • background information, slides, publications, news
• Additionally our code is open source
  • https://github.com/infolis
  • you can install/try out everything locally
  • development of code
Questions, Discussions, Feedback
• Questions?
• Discussions
• Give us feedback
  • Small online survey: http://t1p.de/infolis
    http://wiki.bib.uni-mannheim.de/limesurvey/index.php?sid=55594

Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
James Polillo
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
ArpitMalhotra16
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
ukgaet
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
correoyaya
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
theahmadsaood
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
AlejandraGmez176757
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 

Recently uploaded (20)

The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
Q1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year ReboundQ1’2024 Update: MYCI’s Leap Year Rebound
Q1’2024 Update: MYCI’s Leap Year Rebound
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
一比一原版(UVic毕业证)维多利亚大学毕业证成绩单
 
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
tapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive datatapal brand analysis PPT slide for comptetive data
tapal brand analysis PPT slide for comptetive data
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 

Integration of research literature and data (InFoLiS)

  • 1. Integration of research literature and data (InFoLiS). Katarina Boland (1), Philipp Zumstein (2). (1) GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany; (2) Mannheim University Library, Mannheim, Germany. CNI 2015 Spring Membership Meeting, April 14th, 2015
  • 2. Introduction. The InFoLiS project: Integration of research data and publications. InFoLiS I: 05/2011 - 05/2013. InFoLiS II: 08/2014 - 08/2016. InFoLiS is funded by the DFG (SU 647/2-1).
  • 3. InFoLiS Project Goals. [Diagram: a catalogue of publications (SSOAR (GESIS), Primo (UB MA), ...) and a data catalogue of research data (da|ra (GESIS), ...) each answer queries with responses; links connect the two.]
  • 4. Outline. 1. Part I: Generation of Links. 2. Part II: How can you reuse it?
  • 5. Part I: Generation of Links
  • 6. Citation of Research Data. Recommendation (see http://auffinden-zitieren-dokumentieren.de/zitieren/empfohlene-datenzitation/): "Creator (Publication Date): Title. Publication Agent. Identifier" or "Creator (Publication Date): Title. Version. Publication Agent. Type of Resource. Identifier." → Extraction based on these patterns?
  • 7. References to Datasets. Example: "... presentation and discussion of the empirical findings. For this purpose, data from the Socio-Economic Panel (SOEP) of the years 1990 and 2003 are used and for both periods, the impact factors are estimated using linear regression models." Generalized pattern: "data from the <title> of the years <year> are used"
  • 8. References to Datasets. Example: "Table 1: Population forecast for Germany depending on age cohorts - proportion in percent. Data base: 10th Population Forecast of the Federal Statistical Office, variant 5." Generalized pattern: "(Data base: <number> <title> of the <publication agent>, variant <variant>)"
  • 9. References to Datasets. Example: "Consulted were furthermore ..." Generalized pattern: "Consulted were furthermore <title1>, <title2>, <title3>, ..., <titleN>."
  • 10. References to Datasets. Example: "Table 3: Sample of the surveys conducted in the years 2003 and 2004 as well as size of the sample, with valid data from both surveys (Source: Ditton et al. 2005a)" Generalized pattern: "(Source: <citation of descriptive publication>)"
  • 11. References to Datasets ... are hard to detect! See also: Green, Toby (2009). We Need Publishing Standards for Datasets and Data Tables. OECD Publishing White Paper. doi: 10.1787/603233448430. Altman, Micah and Gary King (2007). A Proposed Standard for the Scholarly Citation of Quantitative Data. In: D-Lib Magazine 13.3. url: http://www.dlib.org/dlib/march07/altman/03altman.html
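
Each generalized pattern on slides 7 to 10 covers only one phrasing, which is part of why such references are hard to detect. As a minimal Python sketch, a template with placeholders can be turned into a regular expression and applied to the slide 7 sentence. The `<slot>` placeholder syntax and the helper `template_to_regex` are assumptions made for illustration; they are not taken from the InFoLiS code.

```python
import re

def template_to_regex(template: str) -> "re.Pattern":
    """Compile a template such as
    'data from the <title> of the years <year> are used'
    into a regex with one named group per <slot> (hypothetical syntax)."""
    # Splitting on a capture group alternates literal text and slot names.
    parts = re.split(r"<(\w+)>", template)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)        # literal context words
        else:
            regex += f"(?P<{part}>.+?)"     # lazily matched slot
    return re.compile(regex)

sentence = ("For this purpose, data from the Socio-Economic Panel (SOEP) "
            "of the years 1990 and 2003 are used and for both periods, ...")
pattern = template_to_regex("data from the <title> of the years <year> are used")
match = pattern.search(sentence)
found = match.groupdict() if match else {}
print(found)
```

A sentence that words the same reference differently would not match, so in practice many such templates are needed.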
  • 12. Automatic Identification of References. Why not simply search for study titles in publications?
  • 13. Automatic Identification of References. Example title: “ALLBUS/GGSS 1996 (Allgemeine Bevölkerungsumfrage der Sozialwissenschaften/German General Social Survey 1996)”
  • 14. Automatic Identification of References. The same title next to a short form: “ALLBUS 96”
  • 15. Automatic Identification of References. Another example title: “Youth 2010”
  • 16. General idea. How do humans recognize study references? Example: "Source: Estimations based on SOEP, wave 2002."
  • 17. General idea. The same sentence with the study name replaced: "Source: Estimations based on xyz, wave 2002."
  • 18. Algorithm
  • 19. Reference Extraction. For details see: Katarina Boland, Dominique Ritze, Kai Eckert & Brigitte Mathiak (2012). Identifying References to Datasets in Publications. In: Proceedings of the Second International Conference on Theory and Practice of Digital Libraries (TPDL), Lecture Notes in Computer Science Volume 7489, pp. 150-161. Berlin: Springer. doi:10.1007/978-3-642-33290-6_17
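
The general idea on slides 16 and 17 is that a reference stays recognizable from its context even when the study name is unknown. It can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of Boland et al. (2012): the function names, the fixed context window of two tokens, and the third toy sentence are all assumptions.

```python
import re
from typing import List, Optional, Set

def derive_pattern(sentence: str, seed: str, window: int = 2) -> Optional["re.Pattern"]:
    """Build a regex from the `window` tokens left and right of a known study name."""
    start = sentence.find(seed)
    if start < 0:
        return None
    left = sentence[:start].split()[-window:]
    right = sentence[start + len(seed):].split()[:window]
    left_re = r"\s+".join(re.escape(tok) for tok in left)
    right_re = r"\s+".join(re.escape(tok) for tok in right)
    # The study name becomes a capture group between the two contexts.
    return re.compile(left_re + r"\s+(\S+?)\s*" + right_re)

def find_studies(corpus: List[str], seed: str) -> Set[str]:
    """Derive patterns from sentences containing the seed, then apply them everywhere."""
    patterns = [p for p in (derive_pattern(s, seed) for s in corpus) if p is not None]
    found: Set[str] = set()
    for pattern in patterns:
        for sentence in corpus:
            found.update(m.group(1) for m in pattern.finditer(sentence))
    return found

corpus = [
    "Source: Estimations based on SOEP, wave 2002.",
    "Source: Estimations based on xyz, wave 2002.",
    "Own calculations based on ALLBUS, wave 1996.",  # toy sentence, not from the slides
]
studies = find_studies(corpus, "SOEP")
print(sorted(studies))
```

Feeding newly found names back in as seeds would turn this into an iterative bootstrapping loop; the sketch stops after one round to stay short.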
  • 20. Mapping to Datasets in da|ra
  • 21. Mapping to Datasets in da|ra: granularity of registration vs. citation. Strategies: 1) greedy; 2) exact; 3) best
  • 22. Mapping to Datasets in da|ra. [Diagram of related ALLBUS datasets: ALLBUS, ALLBUS 1996, ALLBUS 1998, ALLBUS 2000, ALLBUS 2000 CAPI/PAPI, ALLBUScompact, ALLBUScompact 2000, ALLBUScompact 2000 CAPI/PAPI, ALLBUScompact 2000 CAPI, ALLBUS - Cumulation 1980-2006, ALLBUS - Cumulation 1980-2008, ALLBUScompact - Cumulation 1980-2010, ...] → use ontology
  • 23. Ontology: Approach. Vocabulary: e.g. DDI-RDF Discovery Vocabulary. See Thomas Bosch, Richard Cyganiak, Arofan Gregory, Joachim Wackerow (2013): DDI-RDF Discovery Vocabulary: A Metadata Vocabulary for Documenting Research and Survey Data. In: Proceedings of the 6th Linked Data on the Web (LDOW) Workshop at the 22nd International World Wide Web Conference (WWW). CEUR Workshop Proceedings, pp. 46-55
  • 24. Links
  • 25. Integration of Links into Information Systems. Example: da|ra. Example: SSOAR.
  • 26. Thank you for your attention! (katarina.boland@gesis.org) Next part: How can you reuse it?
  • 27. Part II: How can you reuse it?
  • 30. (Internal) Data structure: Document, Pattern, Execution of Algorithm, Study Title, Study URI
  • 31. (Internal) Data structure. Which studies are found in a document?
  • 32. (Internal) Data structure. How was a pattern derived?
  • 33. (Internal) Data structure. Which other study titles are found with the new configuration of the algorithm?
  • 36. RESTful API (web services): GET, POST, PUT, DELETE, PATCH resources; search, perform algorithms, upload files; open for integration into other workflows, e.g. in resource discovery systems, research data catalogues, digital repositories; possible to orchestrate over a web interface for individual use
  • 37. Lookup services on the DB of links. Lookup service: publication URI → study URI. Reverse lookup service: study URI → publication URI.
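
The two lookup services on slide 37 amount to querying the same link table in both directions. A minimal in-memory sketch follows. The class name `LinkStore`, its methods, and the example.org URIs are hypothetical; they stand in for the link database and do not describe the project's actual web services.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set

class LinkStore:
    """Toy stand-in for the link database, indexed in both directions."""

    def __init__(self) -> None:
        self._by_publication: DefaultDict[str, Set[str]] = defaultdict(set)
        self._by_study: DefaultDict[str, Set[str]] = defaultdict(set)

    def add_link(self, publication_uri: str, study_uri: str) -> None:
        self._by_publication[publication_uri].add(study_uri)
        self._by_study[study_uri].add(publication_uri)

    def lookup(self, publication_uri: str) -> List[str]:
        """Lookup service: which studies does this publication reference?"""
        return sorted(self._by_publication.get(publication_uri, ()))

    def reverse_lookup(self, study_uri: str) -> List[str]:
        """Reverse lookup service: which publications reference this study?"""
        return sorted(self._by_study.get(study_uri, ()))

store = LinkStore()
store.add_link("http://example.org/publication/1", "http://example.org/study/allbus-2010")
store.add_link("http://example.org/publication/2", "http://example.org/study/allbus-2010")
print(store.lookup("http://example.org/publication/1"))
print(store.reverse_lookup("http://example.org/study/allbus-2010"))
```

Keeping both indexes in sync on every insert makes the reverse direction as cheap to answer as the forward one.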
  • 38. Extraction of study URIs from a PDF: pdf (fulltext) → pdf2txt → txt (fulltext) → extract study titles (using the DB of patterns) → study titles → linking → study URI
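
The stages on slide 38 compose into one function. In the sketch below the text conversion is passed in as a callable so that no particular PDF library is assumed. The function name `extract_study_uris`, the stub `fake_pdf2txt`, the single hand-written pattern, and the example.org URI are all hypothetical.

```python
import re
from typing import Callable, Dict, Iterable, List

def extract_study_uris(pdf_path: str,
                       pdf2txt: Callable[[str], str],
                       patterns: Iterable["re.Pattern"],
                       title_to_uri: Dict[str, str]) -> List[str]:
    """pdf -> txt -> study titles -> study URIs, in that order."""
    text = pdf2txt(pdf_path)                                 # pdf2txt stage
    titles = [m.group(1) for p in patterns for m in p.finditer(text)]
    uris: List[str] = []
    for title in titles:                                     # linking stage
        uri = title_to_uri.get(title)
        if uri is not None and uri not in uris:
            uris.append(uri)
    return uris

def fake_pdf2txt(path: str) -> str:
    # Stub: a real converter would read the file at `path`.
    return "Source: Estimations based on SOEP, wave 2002."

patterns = [re.compile(r"based on (\S+?), wave")]
title_to_uri = {"SOEP": "http://example.org/study/soep"}
uris = extract_study_uris("paper.pdf", fake_pdf2txt, patterns, title_to_uri)
print(uris)
```

Titles without an entry in `title_to_uri` are dropped silently here; a real linking step would more likely report them for review.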
  • 40. Integration of publications and research data
  • 41. Quoting the Horizon Report 2014 “Visionary leadership for research data management models is also required to determine how to best incorporate data connections into library catalogs” (NMC Horizon Report 2014 - Library Edition, p. 7)
  • 42. Current situation: several steps needed. Common situation today: (1) search the online catalogue; (2) evaluate the search results; (3) find the full text of a relevant source; (4) read the publication; (5) spot the research data. Moreover, the reverse information is often missing completely: which publications are built on some specific research data?
  • 43. Two Approaches. Client-side: load additional data in the catalogue view (e.g. over Ajax); enrich view, links; up-to-date data; embed data in the web presentation. Server-side: add additional data to your catalogue database (e.g. Primo enrichment process); enrich view, links, search, sort, filter; time-lagged because of the update mechanism; do the data fit into the existing infrastructure (fields, tables, database)?
  • 44. Integration as links: link from the catalogue entry to the corresponding research data
  • 45. Integration as popup. Cited research data: 2. ALLBUS 2010 (used in 512 publications); part of ALLBUS (used in 13,456 publications); own research data (used in 1 publication)
  • 46. Integration in search/sort. Example result list: "Cited data sets: 4", "Cited data sets: 1". Sort by data citation.
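
Sorting by data citation, as on slide 46, only needs the number of linked data sets per catalogue record. The following sketch assumes records are plain dictionaries with a `uri` key and that links arrive as a mapping from publication URI to data set identifiers. These shapes, the field name `cited_datasets`, and the function name are illustrative assumptions, not the Primo or SSOAR data model.

```python
from typing import Dict, List

def enrich_and_sort(records: List[dict], links: Dict[str, List[str]]) -> List[dict]:
    """Attach cited data sets to each record, then sort by how many were cited."""
    enriched = [
        {**record, "cited_datasets": links.get(record["uri"], [])}
        for record in records
    ]
    # Most-cited first; sorted() is stable, so ties keep their original order.
    return sorted(enriched, key=lambda r: len(r["cited_datasets"]), reverse=True)

records = [
    {"uri": "http://example.org/publication/1", "title": "Paper A"},
    {"uri": "http://example.org/publication/2", "title": "Paper B"},
]
links = {
    "http://example.org/publication/1": ["d1"],
    "http://example.org/publication/2": ["d1", "d2", "d3", "d4"],
}
result = enrich_and_sort(records, links)
for record in result:
    print(record["title"], "- cited data sets:", len(record["cited_datasets"]))
```

The same enrichment could run server-side during a catalogue update or client-side on each page view, with the freshness and latency trade-offs listed on slide 43.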
  • 48. Enrich your research data catalogue. Cited in: Ritze, D., Paulheim, H., & Eckert, K. (2013). Evaluation Measures for Ontology Matchers in Supervised Matching Scenarios. In The Semantic Web – ISWC 2013 (pp. 392–407). Tags from Publication: Supervised Ontology Matching, Evaluation, Recall, Precision, F-Measure, Precision@N-Curves, ROC-Curves, Precision-Recall-Curves
  • 49. Current Goals of the Project 1. Expansion to other disciplines and languages 2. Linked data based infrastructure 3. Improve the reusability of generated links
  • 50. Dissemination. Our web services will be open for everyone. Project webpage: http://infolis.github.io/ (background information, slides, publications, news). Additionally, our code is open source: https://github.com/infolis (you can install/try out everything locally; development of code).
  • 51. Questions, Discussions, Feedback. Questions? Discussions. Give us feedback. Small online survey: http://t1p.de/infolis (http://wiki.bib.uni-mannheim.de/limesurvey/index.php?sid=55594)