Yolanda Gil, USC Information Sciences Institute, gil@isi.edu
OntoSoft: A Distributed Semantic Registry for Scientific Software
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar
Information Sciences Institute
and Department of Computer Science
University of Southern California
@yolandagil, @dgarijov
{gil,dgarijo,saurabhm,varunr}@isi.edu
http://www.ontosoft.org
Building Block
Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar eScience 2016
We have all been here…
The Value of Software: Reproducibility
Financial
Human lives
Reliability
Scientific integrity
Financial
Trust
[Screenshot: "Retracted Scientific Studies: A Growing List", NYTimes.com, captured 5/29/15]
Quantifying the Value of Software through “Reproducibility Maps” [Bourne & Gil et al 12]
▪ 2 months of effort to reproduce a published method (in PLoS’10)
▪ Authors’ expertise was required
[Figure labels: comparison of ligand binding sites; comparison of dissimilar protein structures; graph network generation; molecular docking]
Work with P. Bourne of UCSD
Software Today
▪ There are repositories of domain-specific software (e.g., geosciences)
▪ There are general software repositories with no standard metadata
▪ Most scientists are not aware of the value of their software
“Dark Software”
▪ Models that are not published
• E.g., from a PhD thesis
▪ Data preparation software
• Data pre-processing and QC can take up to 80% of a project’s effort
▪ Visualization software
“Dark Software” is the counterpart of “Dark Data” [Heidorn 2008]
Why Is Software Not Shared?
▪ “No one would use my code if I shared it”
▪ “My code is really bad”
▪ “My code is not ready to be shared”
▪ “Sharing my software will take a lot of time”
▪ “I won’t get anything out of sharing my software”
▪ “I’ve shared software before, bad things happened”
▪ “I work for the government”
▪ “I want to commercialize my software”
▪ “I don’t want anyone to sell my software”
▪ “I don’t know where to start!”
Contributions: OntoSoft
Registry for software
• Complements code repositories
• Scientist-centered software metadata
• Community curated software metadata
• Training scientists on best practices
OntoSoft Architecture
[Architecture diagram; labels recovered from the figure:
OntoSoft software metadata repository (import, publish, query);
ontologies: OntoSoft software metadata, geosciences, domain ontologies, …;
OntoSoft user interface: publish, browse/search;
domain-specific UI; adapters (e.g., BMI); CSDMS, CF, ESMF, …; Standard Names;
external repositories (push, pull): GitHub, Apache, SVN, CSDMS, NOAA, …;
OntoSoft training: lessons, videos;
VM environment generator: Docker, Vagrant, …;
Solr search index; recommend;
PROV; Web Access Control; metadata access control;
other OntoSoft installations;
legend: OntoSoft components, external components]
The OntoSoft Ontology for Describing Scientific Software Metadata [Gil et al 2015]
▪ An ontology for scientific software metadata
• Intended to describe scientific software
• Designed with scientists in mind, to guide them in depositing and describing their software in a software registry
▪ Major categories of metadata: what does a scientist need to
1. identify software,
2. understand what it does and its utility for research,
3. execute the software,
4. get support if questions arise,
5. do research with it, and
6. contribute to its development?
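As a rough illustration, the six categories above could be modeled as a simple grouping of metadata properties. This is a minimal Python sketch, not the OntoSoft ontology itself: the category keys paraphrase the list above, and every property name (e.g. `has_name`) is a hypothetical placeholder.

```python
from collections import OrderedDict

# Six top-level categories paraphrased from the list above.
# The property names are hypothetical placeholders, not OntoSoft terms.
CATEGORIES = OrderedDict([
    ("identify", ["has_name", "has_unique_id"]),
    ("understand", ["has_description", "has_domain"]),
    ("execute", ["has_install_instructions", "has_dependencies"]),
    ("get_support", ["has_contact", "has_documentation"]),
    ("do_research", ["has_citation", "has_license"]),
    ("contribute", ["has_code_repository", "has_contribution_guide"]),
])


def category_of(prop):
    """Return the category a property belongs to, or None if unknown."""
    for category, props in CATEGORIES.items():
        if prop in props:
            return category
    return None


def group_by_category(metadata):
    """Group a flat {property: value} dict under the six categories."""
    grouped = {category: {} for category in CATEGORIES}
    for prop, value in metadata.items():
        category = category_of(prop)
        if category is not None:  # unknown properties are skipped
            grouped[category][prop] = value
    return grouped
```

Grouping a flat record this way is one plausible way to present properties "organized into categories", so that a scientist fills in one category at a time.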
OntoSoft Metadata Categories
http://www.ontosoft.org/software
Describing Scientific Software in OntoSoft
http://www.ontosoft.org/portal
Metadata for 3DDY Software
Metadata can be exported in several formats (HTML, RDF, JSON)
Metadata properties collected through simple questions
Set permissions for 3DDY
Metadata properties organized into categories that make sense to scientists
Automatic import of metadata from other repositories
Indicators of metadata completeness
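The completeness indicator and JSON export mentioned above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the portal's implementation: the `REQUIRED` property list, the record fields, and the entry name `ExampleModel` are all hypothetical.

```python
import json

# Hypothetical required properties; a real registry would presumably
# derive these from its ontology rather than hard-code them.
REQUIRED = ["name", "description", "license", "code_repository"]


def completeness(metadata, required=REQUIRED):
    """Fraction of required properties that have a non-empty value."""
    if not required:
        return 1.0
    filled = sum(1 for prop in required if metadata.get(prop))
    return filled / len(required)


def export_json(metadata):
    """Serialize a metadata record to JSON with a stable key order."""
    return json.dumps(metadata, sort_keys=True, indent=2)


# Placeholder record: the description is empty and the repository is missing.
entry = {"name": "ExampleModel", "description": "", "license": "Apache-2.0"}
print(f"{completeness(entry):.0%} complete")
print(export_json(entry))
```

Treating empty strings as unfilled keeps the indicator honest when a form field was saved without content.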
Access control
http://www.ontosoft.org/portal
Users and permissions for the 3DDY software component
Setting permissions for editing 3DDY metadata
W3C Web Access Control ontology
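Per-entry edit permissions of the kind shown above can be sketched as a small check. This is a simplified sketch, not OntoSoft's access-control code: the mode names follow the Read and Write access modes of the W3C Web Access Control vocabulary, while the class, the owner rule, and the user names are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Access modes named after the W3C Web Access Control vocabulary.
READ, WRITE = "Read", "Write"


@dataclass
class EntryPermissions:
    """Per-entry permissions: which users hold which access modes."""
    owner: str
    grants: dict = field(default_factory=dict)  # user -> set of modes

    def grant(self, user, mode):
        """Give a user one access mode on this entry."""
        self.grants.setdefault(user, set()).add(mode)

    def allows(self, user, mode):
        """Check whether a user holds the given mode."""
        # Assumption: the owner can always read and write the entry.
        if user == self.owner:
            return True
        return mode in self.grants.get(user, set())


# Hypothetical users for illustration.
perms = EntryPermissions(owner="alice")
perms.grant("bob", READ)
```

Here `perms.allows("bob", WRITE)` is false until the owner grants `WRITE`, which mirrors setting permissions for editing an entry's metadata.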
Software entries from distributed repositories are readily accessible
Semantic search
Comparison matrix of software entries: PIHM, PIHMgis, DrEICH, TauDEM, WBMsed
Metadata completion highlighted
Software is contrasted by property
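A comparison matrix that contrasts entries by property and highlights missing metadata can be sketched as follows. This is a minimal sketch with invented data: the entries `ToolA` and `ToolB` and their values are placeholders, not metadata for any of the software named above.

```python
def comparison_matrix(entries, properties):
    """Build one row per property: [property, value for each entry].

    A value of None marks metadata that an entry has not filled in.
    """
    header = ["property"] + [e["name"] for e in entries]
    rows = [[p] + [e.get(p) for e in entries] for p in properties]
    return header, rows


def missing_cells(header, rows):
    """List (entry_name, property) pairs whose metadata is missing."""
    return [(header[i], row[0])
            for row in rows
            for i in range(1, len(row))
            if row[i] is None]


# Placeholder entries; names and values are invented for illustration.
entries = [
    {"name": "ToolA", "language": "C", "license": "MIT"},
    {"name": "ToolB", "language": "Python"},
]
header, rows = comparison_matrix(entries, ["language", "license"])
```

The list returned by `missing_cells` is what a user interface could use to highlight incomplete cells in the matrix.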
[Diagram: community collaborations; labels recovered from the figure:
themes: Community, Learning, Publication;
features: recommender system, interoperability, structured metadata, interactive advice, best practices, multimedia lessons;
groups: UK Software Institute, Software Carpentry, CIG, ESMF, Critical Zone Observatory, Early Career Advisory Board, FES/ESIP, CSDMS, EarthCube Building Blocks, EarthCube RCNs, Omics, CodeMeta initiative;
collaborating with SEN, C4P, EC3]
Conclusions
Software is a valuable research product
• Must embed best practices of software sharing into research activities
Improve productivity, quality, reproducibility
OntoSoft contributions
• Ontology of scientific software metadata
• Portal for software registry
Do you want to use OntoSoft? Let us know!
http://www.ontosoft.org
http://www.ontosoft.org/software
http://www.ontosoft.org/portal
More Information
http://www.ontosoft.org
http://www.ontosoft.org/software
http://www.ontosoft.org/portal
http://www.ontosoft.org/gpf
▪ OntoSoft: Capturing Scientific Software Metadata. Yolanda Gil, Varun Ratnakar, and Daniel Garijo. Proceedings of the Eighth ACM International Conference on Knowledge Capture (K-CAP), 2015.
▪ OntoSoft: A Distributed Semantic Registry for Scientific Software. Yolanda Gil, Daniel Garijo, Saurabh Mishra, and Varun Ratnakar. Under review, 2016.
▪ DRAT: An Unobtrusive, Scalable Approach to Large Scale Software License Analysis. Chris A. Mattmann, Ji-Hyun Oh, Tyler Palsulich, Lewis John McGibbney, Yolanda Gil, and Varun Ratnakar. Proceedings of the Fourth International Workshop on Software Mining, held in conjunction with the 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2015.
▪ Cyber-Innovated Watershed Research at the Shale Hills Critical Zone Observatory. Xuan Yu, Chris Duffy, Yolanda Gil, Lorne Leonard, Gopal Bhatt, and Evan Thomas. IEEE Systems Journal, to appear.
▪ Collaborative Software Development Needs in Geosciences. Yolanda Gil, Eunyoung Moon, and James Howison. Proceedings of the Second Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2), held in conjunction with the IEEE ACM International Conference on High Performance Computing (SC), New Orleans, LA, November 2014.
▪ Workflow Reuse in Practice: A Study of Neuroimaging Pipeline Users. Daniel Garijo, Oscar Corcho, Yolanda Gil, Meredith N. Braskie, Derrek Hibar, Xue Hua, Neda Jahanshad, Paul Thompson, and Arthur W. Toga. Proceedings of the IEEE Conference on e-Science, 2014.
▪ FragFlow: Automated Fragment Detection in Scientific Workflows. Daniel Garijo, Oscar Corcho, Yolanda Gil, Boris A. Gutman, Ivo D. Dinov, Paul Thompson, and Arthur W. Toga. Proceedings of the IEEE Conference on e-Science, Guarujá, Brazil, October 2014.
▪ An Overview of Mobile Applications for Field Science. Anna Zeng, Kevin Zeng, Yolanda Gil, and Matty Mookerjee. GeoSoft Project Report, September 2014.
▪ The CSDMS Standard Names: Cross-Domain Naming Conventions for Describing Process Models, Data Sets and Their Associated Variables. Scott D. Peckham. Proceedings of the Seventh International Congress on Environmental Modeling and Software, San Diego, CA, June 2014.
▪ Web Applications that Share Level-12 HUC Data and Models of the CONUS. Lorne Leonard and Chris Duffy. Proceedings of the Seventh International Congress on Environmental Modeling and Software, San Diego, CA, June 2014.
▪ Intelligent Workflow Systems and Provenance-Aware Software. Yolanda Gil. Proceedings of the Seventh International Congress on Environmental Modeling and Software, San Diego, CA, June 2014.
Acknowledgements
▪ The OntoSoft project team includes Chris Duffy (PSU), Chris Mattmann (JPL), Scott Peckham (CU), Ji-Hyun Oh (USC), Varun Ratnakar (USC), and Erin Robinson (ESIP)
▪ Thank you to James Howison (UT), Lisa Kempler (MathWorks), and Greg Wilson (Software Carpentry) for their feedback on best practices for software sharing
▪ Thank you to the scientists and other colleagues who have contributed ideas and asked hard questions about software stewardship
▪ Thank you to the National Science Foundation and the EarthCube program for supporting this work
EarthCube!
ICER-1440323
ICER-1343800
http://www.ontosoft.org
http://www.ontosoft.org/software
http://www.ontosoft.org/portal
http://www.ontosoft.org/gpf
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonApplitools
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfDrew Moseley
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencessuser9e7c64
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfkalichargn70th171
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Anthony Dahanne
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Rob Geurden
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingShane Coughlan
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITmanoharjgpsolutions
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingShane Coughlan
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtimeandrehoraa
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxRTS corp
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...OnePlan Solutions
 

Recently uploaded (20)

Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
 
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News UpdateVictoriaMetrics Q1 Meet Up '24 - Community & News Update
VictoriaMetrics Q1 Meet Up '24 - Community & News Update
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
Patterns for automating API delivery. API conference
Patterns for automating API delivery. API conferencePatterns for automating API delivery. API conference
Patterns for automating API delivery. API conference
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
Best Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh ITBest Angular 17 Classroom & Online training - Naresh IT
Best Angular 17 Classroom & Online training - Naresh IT
 
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full RecordingOpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
OpenChain AI Study Group - Europe and Asia Recap - 2024-04-11 - Full Recording
 
SpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at RuntimeSpotFlow: Tracking Method Calls and States at Runtime
SpotFlow: Tracking Method Calls and States at Runtime
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptxThe Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
The Role of IoT and Sensor Technology in Cargo Cloud Solutions.pptx
 
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
Revolutionizing the Digital Transformation Office - Leveraging OnePlan’s AI a...
 

OntoSoft: A Distributed Semantic Registry for Scientific Software

  • 1. Yolanda Gil, USC Information Sciences Institute, gil@isi.edu. OntoSoft: A Distributed Semantic Registry for Scientific Software. Yolanda Gil, Daniel Garijo, Saurabh Mishra, Varun Ratnakar. Information Sciences Institute and Department of Computer Science, University of Southern California. @yolandagil, @dgarijov. {gil,dgarijo,saurabhm,varunr}@isi.edu. http://www.ontosoft.org. Building Block. eScience 2016.
  • 2. We have all been here…
  • 3. The Value of Software: Reproducibility. Diagram labels: Financial, Human lives, Reliability, Scientific integrity, Financial, Trust. [Screenshot: "Retracted Scientific Studies: A Growing List", NYTimes.com, captured 5/29/15]
  • 4. Quantifying the Value of Software through "Reproducibility Maps" [Bourne & Gil et al 12]
      • 2 months of effort in reproducing a published method (in PLoS'10)
      • Authors' expertise was required
      • Workflow steps shown: Comparison of ligand binding sites; Comparison of dissimilar protein structures; Graph network generation; Molecular Docking
      • Work with P. Bourne of UCSD
  • 5. Software Today
      • There are repositories of domain-specific software (e.g., geosciences)
      • There are general software repositories with no standard metadata
      • Most scientists are not aware of the value of their software
  • 6. "Dark Software"
      • Models that are not published (e.g., from a PhD thesis)
      • Data preparation software (data pre-processing and QC can take up to 80% of a project's effort)
      • Visualization software
      • "Dark Software" is the counterpart of "Dark Data" [Heidorn 2008]
  • 7. Why Is Software Not Shared?
      • "No one would use my code if I shared it"
      • "My code is really bad"
      • "My code is not ready to be shared"
      • "Sharing my software will take a lot of time"
      • "I won't get anything out of sharing my software"
      • "I've shared software before, bad things happened"
      • "I work for the government"
      • "I want to commercialize my software"
      • "I don't want anyone to sell my software"
      • "I don't know where to start!"
  • 8. Contributions: OntoSoft, a registry for software
      • Complements code repositories
      • Scientist-centered software metadata
      • Community-curated software metadata
      • Training scientists on best practices
  • 9. OntoSoft Architecture. [Architecture diagram, dated 5/31/2016. Legend: OntoSoft components; external components. Labels shown: OntoSoft Software Metadata Repository (ontologies, OntoSoft software metadata; import, publish, query); OntoSoft User Interface (publish, browse/search); Solr search index; external repository push and pull (GitHub, Apache SVN, CSDMS, NOAA, …); adapters (e.g., BMI) for CSDMS, CF, ESMF, …; domain-specific UI; standard names; domain ontologies (geosciences, …); OntoSoft training lessons and videos; VM environment generator (Docker, Vagrant, …); recommend; other OntoSoft installations; PROV; Web Access Control; metadata access control]
  • 10. The OntoSoft Ontology for Describing Scientific Software Metadata [Gil et al 2015]
      • An ontology for scientific software metadata: intended to describe scientific software, and designed with scientists in mind to guide them to deposit and describe their software in a software registry
      • Major categories of metadata (what does a scientist need?): (1) identify the software, (2) understand what it does and its utility for research, (3) execute the software, (4) get support if questions arise, (5) do research with it, and (6) contribute to its development
  • 11. OntoSoft Metadata Categories. http://www.ontosoft.org/software
  • 12. Describing Scientific Software in OntoSoft. http://www.ontosoft.org/portal [Screenshot: metadata for the 3DDY software, with callouts]
      • Metadata can be exported in several formats (HTML, RDF, JSON)
      • Metadata properties are collected through simple questions
      • Permissions can be set for 3DDY
      • Metadata properties are organized into categories that make sense to scientists
      • Automatic import of metadata from other repositories
      • Indicators of metadata completeness
  • 13. Access control. http://www.ontosoft.org/portal [Screenshots: users and permissions for the 3DDY software component; setting permissions for editing 3DDY metadata; W3C Web Access Control ontology]
  • 14. Software entries from distributed repositories are readily accessible. [Screenshots: semantic search; comparison matrix of software entries (PIHM, PIHMgis, DrEICH, TauDEM, WBMsed), with metadata completion highlighted and software contrasted by property]
  • 15. Community Learning. [Diagram labels: Recommender system; Interoperability; Publication; Community Learning; Structured metadata; Interactive advice; Best practices; Multimedia lessons. Groups shown: UK Software Institute, Software Carpentry, CIG, ESMF, Critical Zone Observatory, Early Career Advisory Board, FES/ESIP, CSDMS, EarthCube Building Blocks, Omics, Code meta initiative. Collaborating with: SEN, C4P, EC3, EarthCube RCNs]
  • 16. Conclusions
      • Software is a valuable research product: we must embed best practices of software sharing into research activities to improve productivity, quality, and reproducibility
      • OntoSoft contributions: an ontology of scientific software metadata, and a portal for the software registry
      • Do you want to use OntoSoft? Let us know! http://www.ontosoft.org http://www.ontosoft.org/software http://www.ontosoft.org/portal
  • 17. More Information. http://www.ontosoft.org http://www.ontosoft.org/software http://www.ontosoft.org/portal http://www.ontosoft.org/gpf
      • OntoSoft: Capturing Scientific Software Metadata. Yolanda Gil, Varun Ratnakar, and Daniel Garijo. Proceedings of the Eighth ACM International Conference on Knowledge Capture (K-CAP), 2015.
      • OntoSoft: A Distributed Semantic Registry for Scientific Software. Yolanda Gil, Daniel Garijo, Saurabh Mishra, and Varun Ratnakar. Under review, 2016.
      • DRAT: An Unobtrusive, Scalable Approach to Large Scale Software License Analysis. Chris A. Mattmann, Ji-Hyun Oh, Tyler Palsulich, Lewis John McGibbney, Yolanda Gil, and Varun Ratnakar. Proceedings of the Fourth International Workshop on Software Mining, held in conjunction with the 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2015.
      • Cyber-Innovated Watershed Research at the Shale Hills Critical Zone Observatory. Xuan Yu, Chris Duffy, Yolanda Gil, Lorne Leonard, Gopal Bhatt, and Evan Thomas. IEEE Systems Journal, to appear.
      • Collaborative Software Development Needs in Geosciences. Yolanda Gil, Eunyoung Moon, and James Howison. Proceedings of the Second Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE2), held in conjunction with the IEEE ACM International Conference on High Performance Computing (SC), New Orleans, LA, November 2014.
      • Workflow Reuse in Practice: A Study of Neuroimaging Pipeline Users. Daniel Garijo, Oscar Corcho, Yolanda Gil, Meredith N. Braskie, Derrek Hibar, Xue Hua, Neda Jahanshad, Paul Thompson, and Arthur W. Toga. Proceedings of the IEEE Conference on e-Science, 2014.
      • FragFlow: Automated Fragment Detection in Scientific Workflows. Daniel Garijo, Oscar Corcho, Yolanda Gil, Boris A. Gutman, Ivo D. Dinov, Paul Thompson, and Arthur W. Toga. Proceedings of the IEEE Conference on e-Science, Guarujá, Brazil, October 2014.
      • An Overview of Mobile Applications for Field Science. Anna Zeng, Kevin Zeng, Yolanda Gil, and Matty Mookerjee. GeoSoft Project Report, September 2014.
      • The CSDMS Standard Names: Cross-Domain Naming Conventions for Describing Process Models, Data Sets and Their Associated Variables. Scott D. Peckham. Proceedings of the Seventh International Congress on Environmental Modeling and Software, San Diego, CA, June 2014.
      • Web Applications that Share Level-12 HUC Data and Models of the CONUS. Lorne Leonard and Chris Duffy. Proceedings of the Seventh International Congress on Environmental Modeling and Software, San Diego, CA, June 2014.
      • Intelligent Workflow Systems and Provenance-Aware Software. Yolanda Gil. Proceedings of the Seventh International Congress on Environmental Modeling and Software, San Diego, CA, June 2014.
  • 18. Acknowledgements
      • The OntoSoft project team includes Chris Duffy (PSU), Chris Mattmann (JPL), Scott Peckham (CU), Ji-Hyun Oh (USC), Varun Ratnakar (USC), and Erin Robinson (ESIP)
      • Thank you to James Howison (UT), Lisa Kempler (MathWorks), and Greg Wilson (Software Carpentry) for their feedback on best practices for software sharing
      • Thank you to the scientists and other colleagues who have contributed ideas and asked hard questions about software stewardship
      • Thank you to the National Science Foundation and the EarthCube program for supporting this work (EarthCube: ICER-1440323, ICER-1343800)
      • http://www.ontosoft.org http://www.ontosoft.org/software http://www.ontosoft.org/portal http://www.ontosoft.org/gpf
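
Slides 10, 12, and 14 describe metadata organized into six scientist-oriented categories, indicators of metadata completeness, and a comparison matrix that contrasts software entries by property. A minimal sketch of how those three ideas could fit together is shown below. It is not the OntoSoft implementation: the category keys paraphrase slide 10, while every property name, the `completeness` and `comparison_matrix` helpers, and the `tool_a`/`tool_b` entries are hypothetical placeholders chosen for illustration.

```python
# Category keys paraphrase the six questions on slide 10.
# The property names are hypothetical, NOT terms from the OntoSoft ontology.
CATEGORIES = {
    "identify": ["name", "unique_id"],
    "understand": ["description", "domain"],
    "execute": ["language", "dependencies"],
    "get_support": ["contact_email"],
    "do_research": ["citation", "license"],
    "contribute": ["code_repository"],
}


def completeness(entry):
    """Return, per category, the fraction of its properties that are filled in."""
    report = {}
    for category, props in CATEGORIES.items():
        filled = sum(1 for prop in props if entry.get(prop))
        report[category] = filled / len(props)
    return report


def comparison_matrix(entries):
    """Build a table with one row per property and one column per entry.

    Properties an entry does not define appear as None, which is how a
    viewer could highlight gaps in the metadata.
    """
    names = list(entries)
    props = [prop for group in CATEGORIES.values() for prop in group]
    header = ["property"] + names
    rows = [[prop] + [entries[name].get(prop) for name in names] for prop in props]
    return [header] + rows


# Hypothetical entries used only to exercise the helpers.
tool_a = {"name": "tool_a", "unique_id": "a-1", "description": "demo model"}
tool_b = {"name": "tool_b", "license": "example-license"}

report = completeness(tool_a)
matrix = comparison_matrix({"tool_a": tool_a, "tool_b": tool_b})
```

Here `report` gives a per-category score that a registry page could render as a completeness indicator, and `matrix` is a plain list of rows that could be rendered as the side-by-side comparison. Scoring each category separately, rather than computing one global percentage, mirrors the slides' emphasis on showing scientists which kind of information is still missing.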