Information Sciences Institute
TOWARDS KNOWLEDGE GRAPHS OF
REUSABLE RESEARCH SOFTWARE
METADATA
Daniel Garijo, Yolanda Gil, Maximiliano Osorio, Varun Ratnakar,
Deborah Khider, Hernan Vargas
Information Sciences Institute, University of Southern
California
@dgarijov
dgarijo@isi.edu
Is there a reproducibility crisis? [Nature, 2016]
Source: https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
Reproducibility in Computational Sciences:
Open Research Data, Software and Methods
Scientific publication
Research Data, Research Software, Research Methods
Challenges for Finding, Understanding,
(Re)Using and Sharing Research Software
Software user
• What does the software component do? Which of its methods should I use?
• How to transform my data to use the software component?
• How to interpret the results produced by the software component?
• How to invoke the software component?
• How to configure the software component with the right parameters?
• How to compare software with similar software?
Software designer
• How to ease capturing the dependencies and installation instructions of my software?
• How to encapsulate my software so it can be used with other data?
• How to describe my software so it can be used by others?
• How to test if my software is ready to be used by others?
• How can my component be found by others?
How are we addressing these challenges?
1. Describe Research Software in a machine-readable manner
2. Link and connect Research Software in Knowledge Graphs
3. Build applications that help find, understand and reuse Research
Software using those Knowledge Graphs
1. Describing Research Software
metadata in a machine-readable
manner
Representing Software Metadata: OntoSoft
Crowdsourced Software Metadata Registry
• Complements code repositories to
make them understandable
• Software metadata designed for
scientists
• Metadata is curated by decentralized
communities of users
• Training scientists on best practices
http://ontosoft.org
Finding Software
OntoSoft: Capturing scientific software metadata. Gil, Y.; Ratnakar, V.; and Garijo, D. In Proceedings of
the 8th International Conference on Knowledge Capture, pages 32, 2015. ACM
Adding Structure to Software Metadata: OKG-Soft
Explore input/output variables
Explore Software I/O files
Knowledge Graph with machine-readable
Software Metadata:
• (From OntoSoft) Attribution, license, funding,
usage examples...
• Executable software components
• Software invocation
• Input & output files, variables and units
• Containers used to encapsulate and run
software components
[Garijo et al 2019]: OKG-Soft: An Open Knowledge Graph with Machine Readable Scientific Software Metadata. International
Conference on eScience, San Diego, USA. 2019
Evolving OntoSoft: Software Description Ontology
https://w3id.org/okn/o/sd#
Extensions:
• Schema.org/Codemeta (software metadata)
• W3C Data Cubes (Contents of inputs and outputs)
• NASA QUDT (Units)
• DockerPedia (Software images)
• Scientific Variables Ontology (Standard Variables)
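To make "machine-readable" concrete, here is a minimal sketch of a JSON-LD record for one software entry, built with the Python standard library. The schema.org property names are real; the class name `sd:Software`, the helper `describe_software` and all example values are assumptions for illustration and should be checked against the ontology documentation.

```python
import json

# Namespace taken from the slide; the prefix "sd" is a local choice.
SD = "https://w3id.org/okn/o/sd#"

def describe_software(name, description, license_url, repo_url):
    """Build a minimal JSON-LD description of one software entry."""
    return {
        "@context": {"sd": SD, "schema": "https://schema.org/"},
        "@type": "sd:Software",  # assumed class name, not verified here
        "schema:name": name,
        "schema:description": description,
        "schema:license": license_url,
        "schema:codeRepository": repo_url,
    }

# Hypothetical entry; none of these values come from a real catalog.
record = describe_software(
    name="example-model",
    description="A placeholder hydrology model",
    license_url="https://example.org/license",
    repo_url="https://example.org/example-model",
)
serialized = json.dumps(record, indent=2)
print(serialized)
```

Because the record is plain JSON-LD, it can be loaded by any RDF toolkit and linked to other entries through shared identifiers.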
1. Describing Research Software Metadata
2. Creating Knowledge Graphs with
Research Software Metadata
• Automatically
Automated Software Metadata Annotation
[Mao et al 2019]: SoMEF: A Framework for Capturing Software Metadata from its Documentation. 2019 IEEE BigData REU Symposium. Los
Angeles, 2019
Software repository: whimian/pyGeoPressure
SoMEF (Software Metadata Extraction Framework) extracts:
• Description: A Python package for pore pressure prediction...
• Installation: pip install pygeopressure
• Invocation: import pygeopressure as ppp
• Citation: Yu, (2018). PyGeoPressure: Geopressure Prediction in Python. Journal of Open Source Software, 3(30), 992, https://doi.org/10.21105/joss.00992
Metadata fields (17 metadata categories): description, installation instructions, invocation, citation, usage notes, requirements, contact, contributors, FAQ, support, license, keywords...
https://somef.readthedocs.io/en/latest/
https://github.com/KnowledgeCaptureAndDiscovery/somef
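The idea of pulling metadata categories out of a README can be illustrated with a toy heuristic that labels Markdown sections by their header text. This is not SoMEF's implementation or API; the keyword map, the function name `extract_sections` and the sample README are assumptions made up for this sketch.

```python
import re

# Hypothetical keyword map: metadata category -> words expected in a header.
CATEGORY_KEYWORDS = {
    "installation": ("install", "setup"),
    "invocation": ("usage", "getting started", "quick start"),
    "citation": ("cite", "citation"),
    "license": ("license",),
}

HEADER = re.compile(r"^#{1,6}\s+(.*)$")

def extract_sections(readme_text):
    """Split a Markdown README on '#' headers and keep the known sections."""
    found = {}
    current = None
    for line in readme_text.splitlines():
        match = HEADER.match(line)
        if match:
            title = match.group(1).strip().lower()
            # First category whose keywords appear in the header, else None.
            current = next(
                (cat for cat, words in CATEGORY_KEYWORDS.items()
                 if any(word in title for word in words)),
                None,
            )
            if current is not None:
                found.setdefault(current, [])
        elif current is not None and line.strip():
            found[current].append(line.strip())
    return {cat: "\n".join(lines) for cat, lines in found.items()}

# Sample README assembled from the fields shown on the slide.
README = """# pyGeoPressure
A Python package for pore pressure prediction.

## Installation
pip install pygeopressure

## Usage
import pygeopressure as ppp
"""

sections = extract_sections(README)
print(sections)
```

The sketch treats any line starting with '#' as a header, so comments inside code fences would be misread; a real extractor needs proper Markdown parsing and, for unlabeled text, some form of classification.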
SOSEN-KG: integrating Zenodo and GitHub
https://github.com/KnowledgeCaptureAndDiscovery/sosen
Prototype with > 13K entries of research software metadata
• Integrating metadata from Zenodo and GitHub (versions, authors, etc.)
• Expanding it with Wikidata (future work)
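Integrating two metadata sources usually comes down to agreeing on a join key. The sketch below merges records on a normalized repository URL; the field names (`repo_url`, `doi`, `stars`), the "first source wins" rule and the sample records are all assumptions for illustration, not how SOSEN-KG is actually built.

```python
def normalize_repo_url(url):
    """Lower-case a repository URL and drop a trailing slash or '.git'."""
    url = url.strip().lower().rstrip("/")
    if url.endswith(".git"):
        url = url[:-4]
    return url

def merge_records(zenodo_records, github_records):
    """Join two lists of dicts on their normalized 'repo_url' field."""
    merged = {}
    for source, records in (("zenodo", zenodo_records),
                            ("github", github_records)):
        for rec in records:
            key = normalize_repo_url(rec["repo_url"])
            entry = merged.setdefault(key, {"repo_url": key, "sources": []})
            entry["sources"].append(source)
            for field, value in rec.items():
                if field != "repo_url":
                    # Keep the first value seen; later sources do not override.
                    entry.setdefault(field, value)
    return merged

# Hypothetical records; the DOI and URLs are placeholders.
zenodo = [{"repo_url": "https://example.org/Lab/Tool/",
           "doi": "10.0000/example", "version": "1.0"}]
github = [{"repo_url": "https://example.org/lab/tool.git",
           "stars": 3, "version": "1.0.1"}]

merged = merge_records(zenodo, github)
print(merged)
```

Lower-casing the whole URL assumes the hosting service treats paths case-insensitively; where that does not hold, only the host part should be normalized.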
1. Describing Research Software Metadata
2. Creating Knowledge Graphs with
Research Software Metadata
• Automatically
• Crowdsourcing
OKG-SOFT
Software Model Catalog contains:
• Models from hydrology, agriculture and economy, their versions and model
configurations.
• More than 200 variables mapped to SVO.
• All models are executable through scientific workflows
• Most contents are added manually and collaboratively by expert users
• Automated unit transformations
• Automated software image description
• Semi-automated Wikidata linking
OKG-Soft: An Open Knowledge Graph with Machine Readable Scientific Software Metadata. Garijo, D.; Osorio, M.; Khider, D.; Ratnakar, V.;
and Gil, Y. In 2019 15th International Conference on eScience (eScience), pages 349–358, San Diego, CA, USA, September 2019. IEEE
1. Describing Research Software Metadata
2. Creating Knowledge Graphs with
Research Software Metadata
• Automatically
• Crowdsourcing
3. Using KGs to Find, Understand and Reuse
Research Software
OntoSoft: Comparing Software Metadata
Software compared: PIHM, PIHMgis, DrEICH, TauDEM, WBMsed
OKG-SOFT Framework: Exploring Research
Software Model Metadata
Explore variables of inputs and outputs
Explore software I/O
Find, compare and configure
software models
http://models.mint.isi.edu
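Once inputs and outputs are described as linked variables, finding and comparing models becomes a graph lookup. The sketch below uses a tiny in-memory list of triples; the model and variable identifiers and the predicate names `hasInput` and `hasOutput` are made up for illustration and are not taken from the actual catalog.

```python
# Tiny in-memory "knowledge graph": (subject, predicate, object) triples.
# All identifiers are hypothetical.
TRIPLES = [
    ("model:A", "hasInput", "var:precipitation"),
    ("model:A", "hasOutput", "var:streamflow"),
    ("model:B", "hasInput", "var:streamflow"),
    ("model:B", "hasOutput", "var:crop_yield"),
]

def subjects(triples, predicate, obj):
    """Return the sorted subjects that have the given predicate and object."""
    return sorted({s for s, p, o in triples if p == predicate and o == obj})

def chainable(triples):
    """Return (producer, consumer) pairs where an output feeds an input."""
    outputs = [(s, o) for s, p, o in triples if p == "hasOutput"]
    inputs = [(s, o) for s, p, o in triples if p == "hasInput"]
    return sorted({(a, b)
                   for a, produced in outputs
                   for b, consumed in inputs
                   if produced == consumed and a != b})

producers = subjects(TRIPLES, "hasOutput", "var:streamflow")
pairs = chainable(TRIPLES)
print(producers)  # ['model:A']
print(pairs)      # [('model:A', 'model:B')]
```

A real catalog would answer the same questions with SPARQL queries over the published graph; the list comprehension here only stands in for that pattern.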
Research Software Reuse:
Encapsulating & Testing
[Diagram labels: Machine-readable component specification; Assistants + Guidelines; Tests; Portable Component; Software Metadata Registry; OKG-SOFT]
https://mic-cli.readthedocs.io/en/latest/
https://dame-cli.readthedocs.io/en/latest/
Summing up...
Overcoming the reproducibility crisis (partly)
• Research software is a critical asset for reproducible
computational experiments
• We need to improve the findability, (re)usability and
understanding of research software:
– Wider adoption
– Better comparison of similar computational methods
– Better understanding of data products
• In this presentation we covered:
– How to describe research software and its metadata
• OntoSoft, Software Description Ontology
– How to build Knowledge Graphs with research software metadata
• OntoSoft, OKG-Soft, SOSEN-KG
– How we are using KGs to help find, compare, understand and reuse
research software
Knowledge Capture and Discovery Group
Yolanda Gil
Varun Ratnakar
Daniel Garijo
Deborah Khider
Maximiliano Osorio
Hernan Vargas
https://knowledgecaptureanddiscovery.github.io/

More Related Content

What's hot

OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
dgarijo
 
Coming to terms to FAIR semantics
Coming to terms to FAIR semanticsComing to terms to FAIR semantics
Coming to terms to FAIR semantics
María Poveda Villalón
 
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
dgarijo
 
Towards Automating Data Narratives
Towards Automating Data NarrativesTowards Automating Data Narratives
Towards Automating Data Narratives
dgarijo
 
FAIRer Research
FAIRer ResearchFAIRer Research
FAIRer Research
Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
Carole Goble
 
The Research Object Initiative: Frameworks and Use Cases
The Research Object Initiative:Frameworks and Use CasesThe Research Object Initiative:Frameworks and Use Cases
The Research Object Initiative: Frameworks and Use Cases
Carole Goble
 
Software Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceSoftware Sustainability: Better Software Better Science
Software Sustainability: Better Software Better Science
Carole Goble
 
Research Objects, SEEK and FAIRDOM
Research Objects, SEEK and FAIRDOMResearch Objects, SEEK and FAIRDOM
Research Objects, SEEK and FAIRDOM
Carole Goble
 
Recommendations for selection process automation in systematic reviews
Recommendations for selection process automation in systematic reviewsRecommendations for selection process automation in systematic reviews
Recommendations for selection process automation in systematic reviews
Faisal Razzak
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Carole Goble
 
The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects
Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
Carole Goble
 
FAIR Data and Model Management for Systems Biology (and SOPs too!)
FAIR Data and Model Management for Systems Biology(and SOPs too!)FAIR Data and Model Management for Systems Biology(and SOPs too!)
FAIR Data and Model Management for Systems Biology (and SOPs too!)
Carole Goble
 
Research Object Community Update
Research Object Community UpdateResearch Object Community Update
Research Object Community Update
Carole Goble
 
RO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research ObjectsRO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research Objects
Carole Goble
 
Software Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesSoftware Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciences
dgarijo
 
Publishing your research: Research Data Management (Introduction)
Publishing your research: Research Data Management (Introduction) Publishing your research: Research Data Management (Introduction)
Publishing your research: Research Data Management (Introduction)
Jamie Bisset
 
FAIR History and the Future
FAIR History and the FutureFAIR History and the Future
FAIR History and the Future
Carole Goble
 
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
Carole Goble
 

What's hot (20)

OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software M...
 
Coming to terms to FAIR semantics
Coming to terms to FAIR semanticsComing to terms to FAIR semantics
Coming to terms to FAIR semantics
 
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Met...
 
Towards Automating Data Narratives
Towards Automating Data NarrativesTowards Automating Data Narratives
Towards Automating Data Narratives
 
FAIRer Research
FAIRer ResearchFAIRer Research
FAIRer Research
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
The Research Object Initiative: Frameworks and Use Cases
The Research Object Initiative:Frameworks and Use CasesThe Research Object Initiative:Frameworks and Use Cases
The Research Object Initiative: Frameworks and Use Cases
 
Software Sustainability: Better Software Better Science
Software Sustainability: Better Software Better ScienceSoftware Sustainability: Better Software Better Science
Software Sustainability: Better Software Better Science
 
Research Objects, SEEK and FAIRDOM
Research Objects, SEEK and FAIRDOMResearch Objects, SEEK and FAIRDOM
Research Objects, SEEK and FAIRDOM
 
Recommendations for selection process automation in systematic reviews
Recommendations for selection process automation in systematic reviewsRecommendations for selection process automation in systematic reviews
Recommendations for selection process automation in systematic reviews
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 
The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
FAIR Data and Model Management for Systems Biology (and SOPs too!)
FAIR Data and Model Management for Systems Biology(and SOPs too!)FAIR Data and Model Management for Systems Biology(and SOPs too!)
FAIR Data and Model Management for Systems Biology (and SOPs too!)
 
Research Object Community Update
Research Object Community UpdateResearch Object Community Update
Research Object Community Update
 
RO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research ObjectsRO-Crate: A framework for packaging research products into FAIR Research Objects
RO-Crate: A framework for packaging research products into FAIR Research Objects
 
Software Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciencesSoftware Metadata: Describing "dark software" in GeoSciences
Software Metadata: Describing "dark software" in GeoSciences
 
Publishing your research: Research Data Management (Introduction)
Publishing your research: Research Data Management (Introduction) Publishing your research: Research Data Management (Introduction)
Publishing your research: Research Data Management (Introduction)
 
FAIR History and the Future
FAIR History and the FutureFAIR History and the Future
FAIR History and the Future
 
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...FAIR Data, Operations and Model management for Systems Biology and Systems Me...
FAIR Data, Operations and Model management for Systems Biology and Systems Me...
 

Similar to Towards Knowledge Graphs of Reusable Research Software Metadata

OntoSoft: A Distributed Semantic Registry for Scientific Software
OntoSoft: A Distributed Semantic Registry for Scientific SoftwareOntoSoft: A Distributed Semantic Registry for Scientific Software
OntoSoft: A Distributed Semantic Registry for Scientific Software
dgarijo
 
Crediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCrediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teams
Carole Goble
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Anita de Waard
 
Better Software, Better Research
Better Software, Better ResearchBetter Software, Better Research
Better Software, Better Research
Carole Goble
 
PhD Defense Øyvind Hauge
PhD Defense Øyvind HaugePhD Defense Øyvind Hauge
PhD Defense Øyvind Hauge
Øyvind Hauge
 
Tag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh PlatformTag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh Platform
Sanjay Padhi, Ph.D
 
A New Paradigm on Analytic-Driven Information and Automation V2.pdf
A New Paradigm on Analytic-Driven Information and Automation V2.pdfA New Paradigm on Analytic-Driven Information and Automation V2.pdf
A New Paradigm on Analytic-Driven Information and Automation V2.pdf
ArmyTrilidiaDevegaSK
 
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
Lee Dirks
 
Visualization for Software Analytics
Visualization for Software AnalyticsVisualization for Software Analytics
Visualization for Software Analytics
Margaret-Anne Storey
 
Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...
Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...
Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...
GigaScience, BGI Hong Kong
 
Fsci 2018 friday3_august_am6
Fsci 2018 friday3_august_am6Fsci 2018 friday3_august_am6
Fsci 2018 friday3_august_am6
ARDC
 
Simbios - Open Science in Biocomputational Research
Simbios - Open Science in Biocomputational ResearchSimbios - Open Science in Biocomputational Research
Simbios - Open Science in Biocomputational Research
jpk
 
IRJET- Comparative Analysis of Various Tools for Data Mining and Big Data...
IRJET-  	  Comparative Analysis of Various Tools for Data Mining and Big Data...IRJET-  	  Comparative Analysis of Various Tools for Data Mining and Big Data...
IRJET- Comparative Analysis of Various Tools for Data Mining and Big Data...
IRJET Journal
 
Towards Knowledge Graph based Representation, Augmentation and Exploration of...
Towards Knowledge Graph based Representation, Augmentation and Exploration of...Towards Knowledge Graph based Representation, Augmentation and Exploration of...
Towards Knowledge Graph based Representation, Augmentation and Exploration of...
Sören Auer
 
Towards a Community-driven Data Science Body of Knowledge – Data Management S...
Towards a Community-driven Data Science Body of Knowledge – Data Management S...Towards a Community-driven Data Science Body of Knowledge – Data Management S...
Towards a Community-driven Data Science Body of Knowledge – Data Management S...
Research Data Alliance
 
Recovery of Traceability Links and Behavior Models for Software Maintenance,...
Recovery of Traceability Links and Behavior Models for Software Maintenance,...Recovery of Traceability Links and Behavior Models for Software Maintenance,...
Recovery of Traceability Links and Behavior Models for Software Maintenance,...
Hironori Washizaki
 
Hahnel "Open Data Policies: Opportunities, compliance and technology strategies"
Hahnel "Open Data Policies: Opportunities, compliance and technology strategies"Hahnel "Open Data Policies: Opportunities, compliance and technology strategies"
Hahnel "Open Data Policies: Opportunities, compliance and technology strategies"
National Information Standards Organization (NISO)
 
IOT-2016 7-9 Septermber, 2016, Stuttgart, Germany
IOT-2016  7-9 Septermber, 2016, Stuttgart, GermanyIOT-2016  7-9 Septermber, 2016, Stuttgart, Germany
IOT-2016 7-9 Septermber, 2016, Stuttgart, Germany
Charith Perera
 
The Experimental Project of DOI Registration for Research Data at Japan Link...
The Experimental Project of DOI Registration for Research Data at Japan Link...The Experimental Project of DOI Registration for Research Data at Japan Link...
The Experimental Project of DOI Registration for Research Data at Japan Link...
National Institute of Informatics (NII)
 
Lies, Damned Lies and Software Analytics: Why Big Data Needs Rich Data
Lies, Damned Lies and Software Analytics:  Why Big Data Needs Rich DataLies, Damned Lies and Software Analytics:  Why Big Data Needs Rich Data
Lies, Damned Lies and Software Analytics: Why Big Data Needs Rich Data
Margaret-Anne Storey
 

Similar to Towards Knowledge Graphs of Reusable Research Software Metadata (20)

OntoSoft: A Distributed Semantic Registry for Scientific Software
OntoSoft: A Distributed Semantic Registry for Scientific SoftwareOntoSoft: A Distributed Semantic Registry for Scientific Software
OntoSoft: A Distributed Semantic Registry for Scientific Software
 
Crediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teamsCrediting informatics and data folks in life science teams
Crediting informatics and data folks in life science teams
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
 
Better Software, Better Research
Better Software, Better ResearchBetter Software, Better Research
Better Software, Better Research
 
PhD Defense Øyvind Hauge
PhD Defense Øyvind HaugePhD Defense Øyvind Hauge
PhD Defense Øyvind Hauge
 
Tag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh PlatformTag.bio: Self Service Data Mesh Platform
Tag.bio: Self Service Data Mesh Platform
 
A New Paradigm on Analytic-Driven Information and Automation V2.pdf
A New Paradigm on Analytic-Driven Information and Automation V2.pdfA New Paradigm on Analytic-Driven Information and Automation V2.pdf
A New Paradigm on Analytic-Driven Information and Automation V2.pdf
 
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012
 
Visualization for Software Analytics
Visualization for Software AnalyticsVisualization for Software Analytics
Visualization for Software Analytics
 
Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...
Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...
Jesse Xiao at CODATA2017: Updates to the GigaDB open access data publishing p...
 
Fsci 2018 friday3_august_am6
Fsci 2018 friday3_august_am6Fsci 2018 friday3_august_am6
Fsci 2018 friday3_august_am6
 
Simbios - Open Science in Biocomputational Research
Simbios - Open Science in Biocomputational ResearchSimbios - Open Science in Biocomputational Research
Simbios - Open Science in Biocomputational Research
 
IRJET- Comparative Analysis of Various Tools for Data Mining and Big Data...
IRJET-  	  Comparative Analysis of Various Tools for Data Mining and Big Data...IRJET-  	  Comparative Analysis of Various Tools for Data Mining and Big Data...
IRJET- Comparative Analysis of Various Tools for Data Mining and Big Data...
 
Towards Knowledge Graph based Representation, Augmentation and Exploration of...
Towards Knowledge Graph based Representation, Augmentation and Exploration of...Towards Knowledge Graph based Representation, Augmentation and Exploration of...
Towards Knowledge Graph based Representation, Augmentation and Exploration of...
 
Towards a Community-driven Data Science Body of Knowledge – Data Management S...
Towards a Community-driven Data Science Body of Knowledge – Data Management S...Towards a Community-driven Data Science Body of Knowledge – Data Management S...
Towards a Community-driven Data Science Body of Knowledge – Data Management S...
 
Recovery of Traceability Links and Behavior Models for Software Maintenance,...
Recovery of Traceability Links and Behavior Models for Software Maintenance,...Recovery of Traceability Links and Behavior Models for Software Maintenance,...
Recovery of Traceability Links and Behavior Models for Software Maintenance,...
 
Hahnel "Open Data Policies: Opportunities, compliance and technology strategies"
Hahnel "Open Data Policies: Opportunities, compliance and technology strategies"Hahnel "Open Data Policies: Opportunities, compliance and technology strategies"
Hahnel "Open Data Policies: Opportunities, compliance and technology strategies"
 
IOT-2016 7-9 Septermber, 2016, Stuttgart, Germany
IOT-2016  7-9 Septermber, 2016, Stuttgart, GermanyIOT-2016  7-9 Septermber, 2016, Stuttgart, Germany
IOT-2016 7-9 Septermber, 2016, Stuttgart, Germany
 
The Experimental Project of DOI Registration for Research Data at Japan Link...
The Experimental Project of DOI Registration for Research Data at Japan Link...The Experimental Project of DOI Registration for Research Data at Japan Link...
The Experimental Project of DOI Registration for Research Data at Japan Link...
 
Lies, Damned Lies and Software Analytics: Why Big Data Needs Rich Data
Lies, Damned Lies and Software Analytics:  Why Big Data Needs Rich DataLies, Damned Lies and Software Analytics:  Why Big Data Needs Rich Data
Lies, Damned Lies and Software Analytics: Why Big Data Needs Rich Data
 

More from dgarijo

WDPlus: Leveraging Wikidata to Link and Extend Tabular Data
WDPlus: Leveraging Wikidata to Link and Extend Tabular DataWDPlus: Leveraging Wikidata to Link and Extend Tabular Data
WDPlus: Leveraging Wikidata to Link and Extend Tabular Data
dgarijo
 
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven ScienceCapturing Context in Scientific Experiments: Towards Computer-Driven Science
Capturing Context in Scientific Experiments: Towards Computer-Driven Science
dgarijo
 
WIDOCO: A Wizard for Documenting Ontologies
WIDOCO: A Wizard for Documenting OntologiesWIDOCO: A Wizard for Documenting Ontologies
WIDOCO: A Wizard for Documenting Ontologies
dgarijo
 
Automated Hypothesis Testing with Large Scale Scientific Workflows
Automated Hypothesis Testing with Large Scale Scientific WorkflowsAutomated Hypothesis Testing with Large Scale Scientific Workflows
Automated Hypothesis Testing with Large Scale Scientific Workflows
dgarijo
 
OEG tools for supporting Ontology Engineering
OEG tools for supporting Ontology EngineeringOEG tools for supporting Ontology Engineering
OEG tools for supporting Ontology Engineering
dgarijo
 
Reproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An OverviewReproducibility Using Semantics: An Overview
Reproducibility Using Semantics: An Overview
dgarijo
 
PhD Thesis: Mining abstractions in scientific workflows
PhD Thesis: Mining abstractions in scientific workflowsPhD Thesis: Mining abstractions in scientific workflows
PhD Thesis: Mining abstractions in scientific workflows
dgarijo
 
Publicación de datos y métodos científicos en investigación
Publicación de datos y métodos científicos en investigaciónPublicación de datos y métodos científicos en investigación
Publicación de datos y métodos científicos en investigación
dgarijo
 
EDBT 2015: Summer School Overview
EDBT 2015: Summer School OverviewEDBT 2015: Summer School Overview
EDBT 2015: Summer School Overview
dgarijo
 
Similarity in Wikipedia Articles (EDBT Summer School)
Similarity in Wikipedia Articles (EDBT Summer School)Similarity in Wikipedia Articles (EDBT Summer School)
Similarity in Wikipedia Articles (EDBT Summer School)
dgarijo
 
Semantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologistsSemantic web 101: Benefits for geologists
Semantic web 101: Benefits for geologists
dgarijo
 
Is preserving data enough? Towards the preservation of scientific methods
Is preserving data enough? Towards the preservation of scientific methods Is preserving data enough? Towards the preservation of scientific methods
Is preserving data enough? Towards the preservation of scientific methods
dgarijo
 
Creating abstractions from scientific workflows: PhD symposium 2015
Creating abstractions from scientific workflows: PhD symposium 2015Creating abstractions from scientific workflows: PhD symposium 2015
Creating abstractions from scientific workflows: PhD symposium 2015
dgarijo
 
Towards Workflow Ecosystems Through Semantic and Standard Representations
Towards Workflow Ecosystems Through Semantic and Standard RepresentationsTowards Workflow Ecosystems Through Semantic and Standard Representations
Towards Workflow Ecosystems Through Semantic and Standard Representations
dgarijo
 
Workflow Reuse in Practice: A Study of Neuroimaging Pipeline Users
Workflow Reuse in Practice: A Study of Neuroimaging Pipeline UsersWorkflow Reuse in Practice: A Study of Neuroimaging Pipeline Users
Workflow Reuse in Practice: A Study of Neuroimaging Pipeline Users
dgarijo
 
Frag Flow: Automated Fragment Detection in Scientific Workflows
Frag Flow: Automated Fragment Detection in Scientific WorkflowsFrag Flow: Automated Fragment Detection in Scientific Workflows
Frag Flow: Automated Fragment Detection in Scientific Workflows
dgarijo
 
User requirments for geospatial provenance
User requirments for geospatial provenanceUser requirments for geospatial provenance
User requirments for geospatial provenance
dgarijo
 


Towards Knowledge Graphs of Reusable Research Software Metadata

  • 1. Information Sciences Institute TOWARDS KNOWLEDGE GRAPHS OF REUSABLE RESEARCH SOFTWARE METADATA Daniel Garijo, Yolanda Gil, Maximiliano Osorio, Varun Ratnakar, Deborah Khider, Hernan Vargas Information Sciences Institute, University of Southern California @dgarijov dgarijo@isi.edu
  • 2. Is there a reproducibility crisis? [Nature, 2016] Source: https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
  • 3. Reproducibility in Computational Sciences: Open Research Data, Software and Methods. Scientific publication: Research Data, Research Software, Research Methods
  • 4. Challenges for Finding, Understanding, (Re)Using and Sharing Research Software. Software user: • What does the software component do? Which of its methods should I use? • How to transform my data to use the software component? • How to interpret the results produced by the software component? • How to invoke the software component? • How to configure the software component with the right parameters? • How to compare software with similar software? Software designer: • How to ease capturing the dependencies and installation instructions of my software? • How to encapsulate my software so it can be used with other data? • How to describe my software so it can be used by others? • How to test if my software is ready to be used by others? • How can my component be found by others?
  • 5. How are we addressing these challenges? 1. Describe Research Software in a machine-readable manner 2. Link and connect Research Software in Knowledge Graphs 3. Build applications that help find, understand and reuse Research Software using those Knowledge Graphs
  • 6. 1. Describing Research Software metadata in a machine-readable manner
  • 7. Representing Software Metadata: OntoSoft. Crowdsourced Software Metadata Registry • Complements code repositories to make them understandable • Software metadata designed for scientists • Metadata is curated by decentralized communities of users • Training scientists on best practices. http://ontosoft.org Finding Software. OntoSoft: Capturing scientific software metadata. Gil, Y.; Ratnakar, V.; and Garijo, D. In Proceedings of the 8th International Conference on Knowledge Capture, pages 32, 2015. ACM
  • 8. Adding Structure to Software Metadata: OKG-Soft. Explore input/output variables. Explore Software I/O files. Knowledge Graph with machine-readable Software Metadata: • (From OntoSoft) Attribution, license, funding, usage examples... • Executable software components • Software invocation • Input & output files, variables and units • Containers used to encapsulate and run software components. [Garijo et al 2019]: OKG-Soft: An Open Knowledge Graph with Machine Readable Scientific Software Metadata. International Conference on eScience, San Diego, USA. 2019
  • 9. Evolving OntoSoft: Software Description Ontology https://w3id.org/okn/o/sd# Extensions: • Schema.org/Codemeta (software metadata) • W3C Data Cubes (Contents of inputs and outputs) • NASA QUDT (Units) • DockerPedia (Software images) • Scientific Variables Ontology (Standard Variables)
  • 10. 1. Describing Research Software Metadata 2. Creating Knowledge Graphs with Research Software Metadata • Automatically
  • 11. Automated Software Metadata Annotation. SoMEF (Software Metadata Extraction Framework) takes a software repository (example: whimian/pyGeoPressure) and extracts metadata fields (17 metadata categories): description, installation instructions, invocation, citation, usage notes, requirements, contact, contributors, FAQ, support, license, keywords... Example output: Description: A Python package for pore pressure prediction... Installation: pip install pygeopressure Invocation: import pygeopressure as ppp Citation: Yu, (2018). PyGeoPressure: Geopressure Prediction in Python. Journal of Open Source Software, 3(30), 992, https://doi.org/10.21105/joss.00992 [Mao et al 2019]: SoMEF: A Framework for Capturing Software Metadata from its Documentation. 2019 IEEE BigData REU Symposium. Los Angeles, 2019 https://somef.readthedocs.io/en/latest/ https://github.com/KnowledgeCaptureAndDiscovery/somef
  • 12. SOSEN-KG: integrating Zenodo and GitHub https://github.com/KnowledgeCaptureAndDiscovery/sosen Prototype with > 13K entries of research software metadata • Integrating metadata from Zenodo and GitHub (versions, authors, etc.) • Expanding it with Wikidata (future work)
  • 13. 1. Describing Research Software Metadata 2. Creating Knowledge Graphs with Research Software Metadata • Automatically • Crowdsourcing
  • 14. OKG-SOFT Software Model Catalog contains: • Models from hydrology, agriculture and economy, their versions and model configurations. • More than 200 variables mapped to SVO. • All models are executable through scientific workflows • Most contents are added manually and collaboratively (by expert users) • Automated unit transformations • Automated software image description • Semi-automated Wikidata linking. OKG-Soft: An Open Knowledge Graph with Machine Readable Scientific Software Metadata. Garijo, D.; Osorio, M.; Khider, D.; Ratnakar, V.; and Gil, Y. In 2019 15th International Conference on eScience (eScience), pages 349–358, San Diego, CA, USA, September 2019. IEEE
  • 15. 1. Describing Research Software Metadata 2. Creating Knowledge Graphs with Research Software Metadata • Automatically • Crowdsourcing 3. Using KGs to Find, Understand and Reuse Research Software
  • 16. OntoSoft: Comparing Software Metadata. PIHM, PIHMgis, DrEICH, TauDEM, WBMsed
  • 17. OKG-SOFT Framework: Exploring Research Software Model Metadata. Explore variables of inputs and outputs. Explore software I/O. Find, compare and configure software models. http://models.mint.isi.edu
  • 18. Research Software Reuse: Encapsulating & Testing. Assistants + Guidelines. Machine-readable component specification. Tests. Portable Component. Software Metadata Registry (OKG-SOFT). https://mic-cli.readthedocs.io/en/latest/ https://dame-cli.readthedocs.io/en/latest/
  • 20. Overcoming the reproducibility crisis (partly) • Research software is a critical asset for reproducible computational experiments • We need to improve the findability, (re)usability and understanding of research software: – Wider adoption – Better comparison of similar computational methods – Better understanding of data products • In this presentation we covered: – How to describe research software and its metadata • OntoSoft, Software Description Ontology – How to build Knowledge Graphs with research software metadata • OntoSoft, OKG-Soft, SOSEN-KG – How we are using KGs to help find, compare, understand and reuse research software
  • 21. Knowledge Capture and Discovery Group: Yolanda Gil, Varun Ratnakar, Daniel Garijo, Deborah Khider, Maximiliano Osorio, Hernan Vargas https://knowledgecaptureanddiscovery.github.io/
  • 22. TOWARDS KNOWLEDGE GRAPHS OF REUSABLE RESEARCH SOFTWARE METADATA Daniel Garijo, Yolanda Gil, Maximiliano Osorio, Varun Ratnakar, Deborah Khider, Hernan Vargas Information Sciences Institute, University of Southern California @dgarijov dgarijo@isi.edu
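
The extraction step on slide 11 (repository documentation in, labelled metadata fields out) can be pictured with a small sketch. The transcript does not describe how SoMEF works internally, so this is a swapped-in keyword heuristic over Markdown headings and not SoMEF's method. The `HEADER_KEYWORDS` map, the function names and the layout of the sample README are assumptions made for illustration; only the pyGeoPressure field values are taken from the slide. One known limitation: a `#` comment inside a README code block would be mistaken for a heading.

```python
import re

# Hypothetical map from heading keywords to metadata categories named on slide 11.
HEADER_KEYWORDS = {
    "install": "installation",
    "usage": "invocation",
    "getting started": "invocation",
    "cite": "citation",
    "citation": "citation",
    "license": "license",
    "requirement": "requirements",
    "contact": "contact",
    "faq": "faq",
    "support": "support",
}

HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def split_sections(readme: str) -> list:
    """Return (heading, body) pairs; text before the first heading gets heading ''."""
    sections = []
    heading, body = "", []
    for line in readme.splitlines():
        match = HEADING.match(line)
        if match:
            sections.append((heading, "\n".join(body).strip()))
            heading, body = match.group(1).strip(), []
        else:
            body.append(line)
    sections.append((heading, "\n".join(body).strip()))
    return sections


def categorize(heading: str):
    """Return the metadata category for a heading, or None if no keyword matches."""
    lowered = heading.lower()
    for keyword, category in HEADER_KEYWORDS.items():
        if keyword in lowered:
            return category
    return None


def extract_metadata(readme: str) -> dict:
    """Label README sections; the first unmatched non-empty section is the description."""
    metadata = {}
    for heading, body in split_sections(readme):
        if not body:
            continue
        category = categorize(heading)
        if category is None and "description" not in metadata:
            category = "description"
        # Keep only the first section found for each category.
        if category is not None and category not in metadata:
            metadata[category] = body
    return metadata


# Assumed README layout; the field values are the ones shown on slide 11.
SAMPLE_README = """# pyGeoPressure

A Python package for pore pressure prediction.

## Installation

pip install pygeopressure

## Usage

import pygeopressure as ppp

## Citation

Yu, (2018). PyGeoPressure: Geopressure Prediction in Python. Journal of Open Source Software, 3(30), 992, https://doi.org/10.21105/joss.00992
"""

for field, value in extract_metadata(SAMPLE_README).items():
    print(f"{field}: {value}")
```

A heading-keyword pass like this only covers documentation that follows common README conventions, so free-text descriptions without headings would need a different technique.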

Editor's Notes

  1. The survey identifies three measures to take: better statistics (to avoid cherry-picking), mentoring, and more robust design.
  2. In our field (computational sciences), the problem narrows down to reusing data, software and methods. There are other aspects, such as hypotheses and experimental design, but these are the core of reproducibility.
  3. Assuming a link is provided in a paper
  4. Assuming a link is provided in a paper