Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software Metadata

1,499 views

Published on

Scientific software is crucial for understanding, reusing and reproducing results in computational sciences. Software is often stored in code repositories, which may contain human readable instructions necessary to use it and set it up. However, a significant amount of time is usually required to understand how to invoke a software component, prepare data in the format it requires, and use it in combination with other software. In this presentation we introduce OKG-Soft, an open knowledge graph that describes scientific software in a machine readable manner. OKG-Soft includes: 1) an ontology designed to describe software and the specific data formats it uses; 2) an approach to publish software metadata as an open knowledge graph, linked to other Web of Data objects; and 3) a framework to annotate, query, explore and curate scientific software metadata.

Published in: Software
  • Be the first to comment

  • Be the first to like this

OKG-Soft: An Open Knowledge Graph With Mathine Readable Scientific Software Metadata

  1. 1. http://mint-project.info OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA Daniel Garijo, Maximiliano Osorio, Deborah Khider, Varun Ratnakar and Yolanda Gil University of Southern California, Information Sciences Institute @dgarijov OKN-AKBC May 22nd,Amherst, USAInformation Sciences Institute
  2. 2. http://mint-project.info Science is changing: Open Science 2 Open publications Open data Open access Open source software Impact and credit OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019
  3. 3. http://mint-project.info The importance of Scientific Software 3 Open publications Open data Open source software • Software helps understand data • Provenance, reproducibility • Software helps understanding methods • Assumptions, limitations OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019
  4. 4. http://mint-project.info Why is it difficult to reuse Scientific Software? Software A Hydrology Model Weather DEM Infiltration Outflow Error FLDAS (climate) Remote sensing Let’s imagine we want to reuse existing work: Software B Map-based visualizations ? ? ? How do I to transform data? How do I invoke code? How do I interpret results? OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 4
  5. 5. http://mint-project.info Outline 5 1. Requirements help scientific software reusability 2. Our current approach for representing scientific software metadata 3. A framework to query, explore, exploit and publish software metadata OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019
  6. 6. http://mint-project.info Outline 6 1. Requirements help scientific software reusability 2. Our current approach for representing scientific software metadata 3. A framework to query, explore, exploit and publish software metadata OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019
  7. 7. http://mint-project.info Requirements for Software Reusability 7 1. Exposing software inputs, outputs and their corresponding variables Hydrology Software Model OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 Weather DEM Infiltration Outflow Error Input1 Input2 Input3 Output1 Output2 - Land surface temperature (degC) - Precipitation rate (mm/h) - Land surface wind speed (m/day) - Net radiation (MJ/(day m^2)
  8. 8. http://mint-project.info Requirements for Software Reusability 8 1. Exposing software inputs, outputs and their corresponding variables 2. Capturing the functions of the software being used OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 Hydrology Software Model Function A: Richards Equation for water movement (unsat soil) Function B: Saint Venant equations (shallow water)
  9. 9. http://mint-project.info Requirements for Software Reusability 9 1. Exposing software inputs, outputs and their corresponding variables 2. Capturing the functions of software being used 3. Using principled ontologies with structured names for model variables, processes, and methods OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 Temp T T_C svo:land_surface_ air__temperature
  10. 10. http://mint-project.info Requirements for Software Reusability 10 1. Exposing software inputs, outputs and their corresponding variables 2. Capturing the functions of software being used 3. Using principled ontologies with structured names for model variables, processes, and methods 4. Capture the semantic structure of software invocations OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 Dependencies? Sample runs? Invocation command? Is data supposed to be in the same folder? Default arguments/Configuration files? Volumes? Do I have to log in in the image
  11. 11. http://mint-project.info Outline 11 1. Requirements help scientific software reusability 2. Our current approach for representing scientific software metadata 3. A framework to query, explore, exploit and publish software metadata OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019
  12. 12. http://mint-project.info Prior Work: OntoSoft Software Metadata Registry 12 OntoSoft Model and Software Metadata Registry • Complements code repositories to make them understandable • Software metadata designed for scientists • Metadata is curated by decentralized communities of users • Training scientists on best practices http://ontosoft.org Finding Software OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 [Gil et al 2015]: OntoSoft: Capturing Scientific Software Metadata Eighth ACM International Conference on Knowledge Capture, Palisades, NY, 2015
  13. 13. http://mint-project.info Prior Work: OntoSoft Software Metadata Registry 13 OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 PIHM PIHMgis DrEICH TauDEM WBMsed
  14. 14. http://mint-project.info Evolving OntoSoft: Software Description Ontology https://w3id.org/okn/o/sd# Extensions: • Schema.org (software metadata) • W3C Data Cubes (Contents of inputs and outputs) • NASA QUDT (Units) • DockerPedia (Software images) • Scientific Variables Ontology (Standard Variables) OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 14
  15. 15. http://mint-project.info Evolving OntoSoft: Extending schema.org and Codemeta https://w3id.org/okn/o/sd# OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 15
  16. 16. http://mint-project.info https://w3id.org/okn/o/sd# Extensions: • Schema.org (software metadata) • W3C Data Cubes (Contents of inputs and outputs) • NASA QUDT (Units) • DockerPedia (Software images) • Scientific Variables Ontology (Standard Variables) OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 16 Evolving OntoSoft: Software Description Ontology
  17. 17. http://mint-project.info Describing Input/Output files, parameters and variables 17 M. Stoica and S. D. Peckham, “An Ontology Blueprint for Constructing Qualitative and Quantitative Scientific Variables,” in Proceedings of the ISWC 2018 Posters & Demonstrations, Industry and Blue Sky Ideas Tracks co-located with 17th International Semantic Web Conference (ISWC 2018), Monterey, USA, October 8th - to - 12th, 2018., 2018 Scientific Variables Ontology identifiers
  18. 18. http://mint-project.info https://w3id.org/okn/o/sd# Extensions: • Schema.org (software metadata) • W3C Data Cubes (Contents of inputs and outputs) • NASA QUDT (Units) • DockerPedia (Software images) • Scientific Variables Ontology (Standard Variables) OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 18 Evolving OntoSoft: Software Description Ontology
  19. 19. http://mint-project.info Machine readable representation of units 19 “RAIN” mm/day CCUT Linking and augmenting from Wikidata • B. Shbita, A. Rajendran, J. Pujara, and C. Knoblock, Parsing, Representing and Transforming Units of Measure, in Modeling the World's Systems, 2019.
  20. 20. http://mint-project.info https://w3id.org/okn/o/sd# Extensions: • Schema.org (software metadata) • W3C Data Cubes (Contents of inputs and outputs) • NASA QUDT (Units) • DockerPedia (Software images) • Scientific Variables Ontology (Standard Variables) OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 20 Evolving OntoSoft: Describing Containers
  21. 21. http://mint-project.infoOKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 21 Evolving OntoSoft: Describing Containers “mintproject/ pihm2cycles” Docker pedia Image tag Image semantic representation https://dockerpedia.inf.utfsm.cl/ • M. Osorio, H. Vargas, and C. Buil Aranda, “DockerPedia: a Knowledge Graph of Docker Images,” in Proceedings of the ISWC 2018 Posters & Demonstrations, Industry and Blue Sky Ideas Tracks co-located with 17th International Semantic Web Conference (ISWC 2018), Monterrey, 2018.
  22. 22. http://mint-project.info Outline 22 1. Requirements help scientific software reusability 2. Our current approach for representing scientific software metadata 3. A framework to query, explore, exploit and publish software metadata OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019
  23. 23. http://mint-project.info OKG-SOFT: Framework 23 Software Model Catalog contains: • Models from hydrology, agriculture and economy, their versions and model configurations. • More than 200 variables mapped to SVO. • All models are executable through scientific workflows • Most contents are added manually (expert users) collaboratively • Automated unit transformations • Automated software image description • Semi-automated Wikidata linking OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 https://query.mint.isi.edu/api/mintproject/MINT-ModelCatalogQueries#/ APIs: • SPARQL endpoint • REST APIs (GET/POST) • Python clients
  24. 24. http://mint-project.info Exploitation: Exploring Scientific Software Model Metadata 24http://models.mint.isi.edu Explore variables OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 Explore Software I/O Find Software Models
  25. 25. http://mint-project.info Exploitation: Comparing Scientific Software Models 25 http://models.mint.isi.edu OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019
  26. 26. http://mint-project.info Exploitation: Towards Automated Software Composition 26 OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019
  27. 27. http://mint-project.info Summary 27 Scientific software reusability is crucial to understand • Existing data • Published methods 1. Requirements for scientific software reusability include • Expose inputs, outputs, variables and software invocation details! 2. Our approach for capturing and structuring scientific software 3. A framework to query, explore, exploit and publish software metadata OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019
  28. 28. http://mint-project.info Help us making your software more reusable 28 We would like to thank Scott Peckham, Maria Stoica, Chris Duffy, Lele Shu, Kelly Cobourn, Zeya Zhang Suzanne Pierce, Armen Kemanian, Rajiv Mayani, Jay Puajara, Basel Shbita, Dhruv Pattel, Rohit Mayura, Amrish Goel and Anuj Doiphode OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019 Contact me: dgarijo@isi.edu

×