
Towards Reusable Research Software

An increasing number of researchers rely on computational methods to generate the results described in their publications. Research software created to this end is heterogeneous (scripts, libraries, packages, notebooks, and so on) and usually difficult to find, reuse, compare and understand, because its documentation is disconnected (dispersed across manuals, readme files, web sites, and code comments) and it lacks structured metadata to describe it. In this talk I will describe the main challenges for finding, comparing and reusing research software; how structured metadata can help address some of them; which best practices the community is proposing; and current initiatives to aid their adoption by researchers within EOSC.
Impact: The talk addresses an important aspect of the EOSC infrastructure for quality research software by ensuring that software contributed to the EOSC ecosystem can be found, compared and reused by researchers. The talk also aims to address metadata quality of current research products, which is critical for successful adoption.
Presented at the EOSC symposium

  1. Towards Reusable Research Software
     Daniel Garijo Verdejo (@dgarijov, daniel.garijo@upm.es)
     Ontology Engineering Group, Departamento de Inteligencia Artificial, Facultad de Informática, Universidad Politécnica de Madrid
  2. Reproducibility: Open Research Data, Software and Methods
     [Diagram labels: Scientific publication; Research Data; Research Software; Research Methods]
     EOSC Symposium: Infrastructure for quality research software
  3. Challenges for (Re)using and Sharing Research Software
     Software user:
     • What does the software component do? Which of its methods should I use?
     • How to transform my data to use the software component?
     • How to interpret the results produced by the software component?
     • How to invoke the software component?
     • How to configure the software component with the right parameters?
     • How to compare against similar methods?
     Software designer:
     • How to ease capturing the dependencies and installation instructions of my software?
     • How to encapsulate my software so it can be used with other data?
     • How to describe my software so it can be used by others?
     • How to test if my software is ready to be used by others?
  4. Community Initiatives and Standards
     • Describing Research Software
       • Schema.org & Codemeta
       • Common Workflow Language (I/O)
     • Packaging Research Artefacts (incl. software)
       • Research Objects (RO-Crate)
     • Aggregators (OpenAIRE, EOSC)
     • General (e.g., Zenodo) & domain-specific registries
     • Scicodes (https://scicodes.net/)
     Nine Best Practices for Research Software Registries and Repositories: A Concise Guide. https://arxiv.org/abs/2012.13117
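As an illustration of what a Codemeta-style description can look like, here is a minimal sketch in Python that builds a JSON-LD record using a handful of Schema.org/Codemeta terms. The tool name, repository URL, license URL and author are hypothetical placeholders, and the record is far from a complete Codemeta file.

```python
import json


def build_codemeta(name, description, repository, license_url, authors):
    """Return a minimal Codemeta-style description as a plain dict."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "description": description,
        "codeRepository": repository,
        "license": license_url,
        "author": [{"@type": "Person", "name": author} for author in authors],
    }


# All values below are hypothetical and only serve the example.
record = build_codemeta(
    name="example-tool",
    description="Toy component used to illustrate software metadata.",
    repository="https://example.org/example-tool",
    license_url="https://spdx.org/licenses/MIT",
    authors=["Jane Doe"],
)

# Serialise the record, e.g. to store it as codemeta.json next to the code.
codemeta_json = json.dumps(record, indent=2)
print(codemeta_json)
```

Keeping such a file in the code repository gives registries and aggregators a single machine-readable place to look, instead of parsing free-text documentation.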
  5. Adopting annotation vocabularies: where are we at?
     Machine-readable software metadata is not abundant.
     • "Can you please describe your software component with metadata?"
     • "I already did! Did you read the project readme? Did you see the online documentation? Perhaps you saw the paper?"
     Many domain-specific registries are curated by hand by experts.
  6. Automated Software Metadata Extraction
     SOMEF: SOftware Metadata Extraction Framework
     https://github.com/KnowledgeCaptureAndDiscovery/somef/
     Code repository (readme)
     Machine-readable file with software metadata:
     • > 20 common metadata fields
     • Installation instructions, description, invocation command, license, author, citation, requirements, examples, documentation, notebooks, etc.
     • Analysis of readme and supplementary files (e.g., notebooks, Dockerfiles)
     • JSON, RDF (graph), Codemeta, RO (in progress)
     [Mao et al 2019]: SoMEF: A Framework for Capturing Software Metadata from its Documentation. 2019 IEEE BigData REU Symposium. Los Angeles, 2019
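To make the idea of extracting metadata from a readme concrete, the following is a toy Python sketch that groups readme sections under metadata fields based on keywords in their headers. It is not SOMEF's method or API; the keyword table, the function name and the sample readme are all assumptions made for illustration.

```python
import re

# Hypothetical mapping from header keywords to metadata fields.
HEADER_KEYWORDS = {
    "install": "installation",
    "usage": "invocation",
    "license": "license",
    "citation": "citation",
    "requirement": "requirements",
}


def extract_metadata(readme_text):
    """Group README sections under metadata fields using header keywords."""
    metadata = {}
    # Splitting on markdown headers with a capture group yields
    # [preamble, title1, body1, title2, body2, ...].
    parts = re.split(r"^#+[ \t]*(.+)$", readme_text, flags=re.MULTILINE)
    for title, body in zip(parts[1::2], parts[2::2]):
        for keyword, field in HEADER_KEYWORDS.items():
            if keyword in title.lower():
                metadata.setdefault(field, []).append(body.strip())
                break
    return metadata


# Hypothetical readme of a made-up project.
README = """# example-tool
A toy project.

## Installation
pip install example-tool

## Usage
example-tool --input data.csv

## License
MIT
"""

metadata = extract_metadata(README)
for field, excerpts in metadata.items():
    print(field, "->", excerpts)
```

A keyword heuristic like this only catches conventionally named sections, which hints at why automated extraction frameworks and hand curation by experts are both still needed.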
  7. Leveraging Software Metadata to create Knowledge Graphs
     Knowledge Graphs can link research software (RS) and its components:
     • Explore input/output variables (interoperability)
     • Explore software I/O files (composition)
     OKG-Soft: machine-readable Software Metadata:
     • (From Schema.org) Attribution, license, funding, usage examples...
     • Executable software components
     • Software invocation
     • Input & output files, variables and units
     • Containers used to encapsulate and run software components
     • Parameter validation and suggestion
     [Garijo et al 2019]: OKG-Soft: An Open Knowledge Graph with Machine Readable Scientific Software Metadata. International Conference on eScience, San Diego, USA. 2019
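The composition idea can be sketched with a tiny in-memory graph: if one component's output variable is another component's input variable, the two can potentially be chained. This is a minimal Python sketch, not OKG-Soft's data model; the component names, variable names and the `hasInput`/`hasOutput` predicates are hypothetical.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples; all names are illustrative.
TRIPLES = [
    ("rainfall_model", "hasOutput", "precipitation_mm"),
    ("flood_model", "hasInput", "precipitation_mm"),
    ("flood_model", "hasOutput", "flood_depth_m"),
    ("crop_model", "hasInput", "precipitation_mm"),
    ("crop_model", "hasInput", "temperature_c"),
]


def composable_pairs(triples):
    """Return (producer, variable, consumer) links found in the graph."""
    producers = defaultdict(set)
    consumers = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "hasOutput":
            producers[obj].add(subject)
        elif predicate == "hasInput":
            consumers[obj].add(subject)

    pairs = set()
    for variable, sources in producers.items():
        for source in sources:
            # .get avoids creating empty entries for unconsumed variables.
            for target in consumers.get(variable, ()):
                if source != target:
                    pairs.add((source, variable, target))
    return sorted(pairs)


links = composable_pairs(TRIPLES)
for source, variable, target in links:
    print(f"{source} --{variable}--> {target}")
```

Matching on exact variable names is a simplification; the slide also lists units, which a real check would presumably need to compare as well.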
  8. Conclusions
     Research Software Metadata should be actionable and useful for:
     • Understanding the differences between two or more software components
     • Helping portability (ROs)
     • Adding components to workflows (CWL + ROs)
     • Linking similar software methods
     • Building automated comparison benchmarks
     • Reducing the time needed to understand and adopt an existing software component
     • Crediting authors
  9. Questions?
     Let's create machine-actionable software metadata
     [Diagram labels: Code + documentation; Automated extraction; SOMEF; Knowledge Graphs; + findable, portable, comparable, executable, reusable]
     Image credit: https://icons8.com/icons/
     Acknowledgements: Yolanda Gil, Deborah Khider, Varun Ratnakar, Maximiliano Osorio, Hernan Vargas, Oscar Corcho