Presentation of the "Coming to terms to FAIR semantics" paper at the 22nd International Conference on Knowledge Engineering and Knowledge Management (EKAW 2020).
FAIR Workflows: A step closer to the Scientific Paper of the Future (dgarijo)
Keynote presented at the Computational and Autonomous Workflows (CAW-2021) workshop at Oak Ridge National Laboratory. The keynote gives an overview of the different aspects to take into account when aiming to create FAIR workflows and associated resources.
FOOPS!: An Ontology Pitfall Scanner for the FAIR principles (dgarijo)
Slides presented at the DBpedia Day at the SEMANTiCS conference in 2021. FOOPS! (available at https://w3id.org/foops) is a validator based on the FAIR principles that guides users in conforming their ontologies to them. For each principle, FOOPS! runs a series of tests and reports errors, suggestions and ways to conform to best practices.
We propose a new area of research on automating data narratives. Data narratives are containers of information about computationally generated research findings. They have three major components: 1) a record of events that describes a new result through a workflow and/or the provenance of all the computations executed; 2) persistent entries for the key entities involved: data, software versions, and workflows; 3) a set of narrative accounts, automatically generated human-consumable renderings of the record and entities that can be included in a paper. Different narrative accounts can be produced for different audiences, with content and detail tailored to the level of interest or expertise of the reader. Data narratives can make science more transparent and reproducible, because they ensure that the text description of the computational experiment reflects with high fidelity what was actually done. Data narratives can be incorporated in papers, either in the methods section or as supplementary materials. We introduce DANA, a prototype that illustrates how to generate data narratives automatically, and describe the information it uses from the computational records. We also present a formative evaluation of our approach and discuss potential uses of automated data narratives.
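The three components above can be pictured as a small data structure. The sketch below is a hypothetical illustration of the idea; the class and field names are our own, not DANA's actual schema:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a data narrative's three components; the class and
# field names are illustrative, not DANA's actual schema.

@dataclass
class Event:
    step: str       # computation that was executed
    inputs: list    # provenance: data and software used
    outputs: list   # results produced

@dataclass
class DataNarrative:
    record: list    # 1) record of events (workflow/provenance)
    entities: list  # 2) persistent entries for key entities
    accounts: dict = field(default_factory=dict)  # 3) renderings per audience

    def render(self, audience: str) -> str:
        # A narrative account: a human-consumable rendering of the record.
        lines = [f"The experiment executed {len(self.record)} step(s):"]
        for ev in self.record:
            lines.append(f"- {ev.step} ({len(ev.inputs)} inputs, {len(ev.outputs)} outputs)")
        self.accounts[audience] = "\n".join(lines)
        return self.accounts[audience]

narrative = DataNarrative(record=[Event("align reads", ["reads.fq"], ["out.bam"])],
                          entities=["reads.fq v1", "aligner v2.1"])
print(narrative.render("general audience"))
```

Different calls to `render` with different audiences would, in a full system, vary the content and detail of each account.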
Towards Knowledge Graphs of Reusable Research Software Metadata (dgarijo)
Research software is a key asset for understanding, reusing and reproducing results in computational sciences. An increasing amount of software is stored in code repositories, which usually contain human readable instructions indicating how to use it and set it up. However, developers and researchers often need to spend a significant amount of time to understand how to invoke a software component, prepare data in the required format, and use it in combination with other software. In addition, this time investment makes it challenging to discover and compare software with similar functionality. In this talk I will describe our efforts to address these issues by creating and using Open Knowledge Graphs that describe research software in a machine readable manner. Our work includes: 1) an ontology that extends schema.org and CodeMeta, designed to describe software and the specific data formats it uses; 2) an approach to publish software metadata as an open knowledge graph, linked to other Web of Data objects; 3) a framework for automatically extracting metadata from software repositories; and 4) a framework to curate, query, explore and compare research software metadata in a collaborative manner. The talk will illustrate our approach with real-world examples, including a domain application for inspecting and discovering hydrology, agriculture, and economic software models, and the results of our framework when enriching the research software entries in Zenodo.org.
OKG-Soft: An Open Knowledge Graph With Machine Readable Scientific Software M... (dgarijo)
Scientific software is crucial for understanding, reusing and reproducing results in computational sciences. Software is often stored in code repositories, which may contain human readable instructions necessary to use it and set it up. However, a significant amount of time is usually required to understand how to invoke a software component, prepare data in the format it requires, and use it in combination with other software. In this presentation we introduce OKG-Soft, an open knowledge graph that describes scientific software in a machine readable manner. OKG-Soft includes: 1) an ontology designed to describe software and the specific data formats it uses; 2) an approach to publish software metadata as an open knowledge graph, linked to other Web of Data objects; and 3) a framework to annotate, query, explore and curate scientific software metadata.
Scientific Software Registry Collaboration Workshop: From Software Metadata r... (dgarijo)
In this talk I briefly describe our work on OntoSoft for easy software metadata representation, and how new requirements for software reusability are moving us towards knowledge graphs of scientific software metadata.
A Template-Based Approach for Annotating Long-Tailed Datasets (dgarijo)
An increasing amount of data is shared on the Web through heterogeneous spreadsheets and CSV files. In order to homogenize and query these data, the scientific community has developed Extract, Transform and Load (ETL) tools and services that help make these files machine readable as Knowledge Graphs (KGs). However, tabular data may be complex, and the level of expertise required by existing ETL tools makes it difficult for users to describe their own data. In this paper we propose a simple annotation schema to guide users when transforming complex tables into KGs. We have implemented our approach by extending T2WML, a table annotation tool designed to help users annotate their data and upload the results to a public KG. We have evaluated our effort with six non-expert users, obtaining promising preliminary results.
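The core idea of a simple annotation schema can be sketched in a few lines: the schema states which column identifies the subject and which KG property each remaining column maps to. The schema keys and `ex:` prefix below are illustrative assumptions, not T2WML's actual annotation format:

```python
import csv
import io

# Hypothetical annotation schema: maps each table column to a KG property.
# These keys and the "ex:" prefix are illustrative, not T2WML's real format.
SCHEMA = {
    "subject_column": "city",
    "property_map": {"population": "ex:population", "country": "ex:country"},
}

def table_to_triples(csv_text, schema):
    """Turn each data row into (subject, property, value) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = "ex:" + row[schema["subject_column"]].replace(" ", "_")
        for column, prop in schema["property_map"].items():
            triples.append((subject, prop, row[column]))
    return triples

table = "city,population,country\nMadrid,3300000,Spain\n"
print(table_to_triples(table, SCHEMA))
```

A non-expert user would only fill in the schema; the ETL machinery then applies it uniformly to every row.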
An increasing number of researchers rely on computational methods to generate the results described in their publications. Research software created to this end is heterogeneous (e.g., scripts, libraries, packages, notebooks, etc.) and usually difficult to find, reuse, compare and understand due to its disconnected documentation (dispersed in manuals, readme files, web sites, and code comments) and a lack of structured metadata to describe it. In this talk I will describe the main challenges for finding, comparing and reusing research software, how structured metadata can help to address some of them, the best practices being proposed by the community, and current initiatives to aid their adoption by researchers within EOSC.
Impact: The talk addresses an important aspect of the EOSC infrastructure for quality research software by ensuring that software contributed to the EOSC ecosystem can be found, compared and reused by researchers. The talk also aims to address metadata quality of current research products, which is critical for successful adoption.
Presented at the EOSC symposium
SOMEF: a metadata extraction framework from software documentation (dgarijo)
Presentation given at the Council of Software Registries in March 2021. SOMEF is a Python package for automatically extracting over 25 metadata categories from a README file. The output is exported as JSON or as JSON-LD using the CodeMeta representation.
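The kind of extraction SOMEF performs can be illustrated with a toy heading-based scan of a README. The patterns and category names below are simplified stand-ins for illustration only; the real tool combines header analysis with trained classifiers:

```python
import re

# Toy sketch of README-based metadata extraction in the spirit of SOMEF.
# These three patterns are simplified stand-ins, not SOMEF's actual logic
# (which covers 25+ categories and uses supervised classifiers as well).
PATTERNS = {
    "installation": re.compile(r"^#+\s*install.*$", re.I | re.M),
    "usage": re.compile(r"^#+\s*usage.*$", re.I | re.M),
    "license": re.compile(r"^#+\s*licen[cs]e.*$", re.I | re.M),
}

def extract_sections(readme_text):
    """Return the README text found under each recognised heading."""
    found = {}
    for category, pattern in PATTERNS.items():
        match = pattern.search(readme_text)
        if match:
            # Keep everything until the next heading (or end of file).
            rest = readme_text[match.end():]
            nxt = re.search(r"^#+\s", rest, re.M)
            found[category] = rest[: nxt.start() if nxt else len(rest)].strip()
    return found

readme = "# MyTool\n\n## Installation\npip install mytool\n\n## Usage\nRun mytool.\n"
print(extract_sections(readme))
```

The extracted dictionary could then be serialized to JSON or mapped onto CodeMeta terms for JSON-LD export.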
OBA: An Ontology-Based Framework for Creating REST APIs for Knowledge Graphs (dgarijo)
In this presentation we describe the Ontology-Based APIs framework (OBA), our approach to automatically create REST APIs from ontologies while following RESTful API best practices. Given an ontology (or ontology network), OBA uses standard technologies familiar to web developers (OpenAPI Specification, JSON) and combines them with W3C standards (OWL, JSON-LD frames and SPARQL) to create maintainable APIs with documentation, unit tests, automated validation of resources, and clients (in Python, JavaScript, etc.) that let non-Semantic Web experts access the contents of a target knowledge graph. We showcase OBA with three examples that illustrate the capabilities of the framework for different ontologies.
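The mapping at the heart of this approach can be sketched as a function from ontology class names to OpenAPI paths. The CRUD-style paths and pluralization rule below are illustrative assumptions, not OBA's exact output:

```python
import json

# Sketch of the kind of mapping OBA performs: each ontology class yields
# REST paths in an OpenAPI description. The paths and the naive
# pluralization below are illustrative assumptions, not OBA's real output.
def paths_for_class(class_name):
    resource = class_name.lower() + "s"  # e.g. Region -> /regions
    ok = {"200": {"description": "Successful response"}}
    return {
        f"/{resource}": {
            "get": {"summary": f"List all instances of {class_name}",
                    "responses": ok},
        },
        f"/{resource}/{{id}}": {
            "get": {"summary": f"Get a {class_name} by id",
                    "responses": ok},
        },
    }

def ontology_to_openapi(class_names):
    spec = {"openapi": "3.0.0",
            "info": {"title": "Generated API", "version": "1.0.0"},
            "paths": {}}
    for name in class_names:
        spec["paths"].update(paths_for_class(name))
    return spec

spec = ontology_to_openapi(["Region", "Model"])
print(json.dumps(sorted(spec["paths"])))
```

In the full framework, each generated path would additionally be backed by a SPARQL query template and a JSON-LD frame that shape the knowledge-graph results into plain JSON.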
Towards Human-Guided Machine Learning - IUI 2019 (dgarijo)
Automated Machine Learning (AutoML) systems are emerging that automatically search for possible solutions from a large space of possible kinds of models. Although fully automated machine learning is appropriate for many applications, users often have knowledge that supplements and constrains the available data and solutions. This paper proposes human-guided machine learning (HGML) as a hybrid approach where a user interacts with an AutoML system and tasks it to explore different problem settings that reflect the user's knowledge about the data available. We present: 1) a task analysis of HGML that shows the tasks that a user would want to carry out, 2) a characterization of two scientific publications, one in neuroscience and one in political science, in terms of how the authors would search for solutions using an AutoML system, 3) requirements for HGML based on those characterizations, and 4) an assessment of existing AutoML systems in terms of those requirements.
Some tools developed at the OEG (Ontology Engineering Group) for facilitating ontology engineering activities such as evaluation, documentation, releasing and publication.
Presented at WORKS 2021 (https://works-workshop.org/), the 16th Workshop on Workflows in Support of Large-Scale Science, November 15, 2021, held in conjunction with SC21: The International Conference for High Performance Computing, Networking, Storage and Analysis.
Keynote on software sustainability given at the 2nd Annual Netherlands eScience Symposium, November 2014.
Based on the article: Carole Goble, "Better Software, Better Research", IEEE Internet Computing, vol. 18, no. 5 (Sept.-Oct. 2014), pp. 4-8, IEEE Computer Society.
http://www.computer.org/csdl/mags/ic/2014/05/mic2014050004.pdf
http://doi.ieeecomputersociety.org/10.1109/MIC.2014.88
http://www.software.ac.uk/resources/publications/better-software-better-research
Introduction of semantic technology for SAS programmers (Kevin Lee)
There is a new technology for expressing and searching data that provides more meaning and relationships: semantic technology. Semantic technology makes it easy to add, change and implement meaning and relationships in current data. Companies such as Facebook and Google are already using semantic technology; for example, Facebook Graph Search uses it to provide more meaningful searches for its users.
The paper will introduce the basic concepts of semantic technology and its graph data model, the Resource Description Framework (RDF). RDF links data elements in a self-describing way through three components: subject, predicate and object. The paper will present applications and examples of these RDF elements, and will introduce three different RDF serializations: RDF/XML, Turtle and N-Triples.
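A single triple in two of these serializations can be illustrated as follows. The formatting is hand-rolled for clarity; a real toolkit such as rdflib would handle escaping, datatypes and prefixes properly:

```python
# One RDF triple (subject, predicate, object) rendered in two of the
# serializations the paper covers. The example.org IRIs are illustrative.
EX = "http://example.org/"

triple = (EX + "patient1", EX + "hasAge", "42")

def to_ntriples(s, p, o):
    # N-Triples: one fully written-out triple per line, terminated by " ."
    return f"<{s}> <{p}> \"{o}\" ."

def to_turtle(s, p, o, prefix="ex", base=EX):
    # Turtle: the same triple, abbreviated via a @prefix declaration
    header = f"@prefix {prefix}: <{base}> ."
    body = f"{prefix}:{s[len(base):]} {prefix}:{p[len(base):]} \"{o}\" ."
    return header + "\n" + body

print(to_ntriples(*triple))
print(to_turtle(*triple))
```

Both outputs describe exactly the same subject-predicate-object statement; only the surface syntax differs.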
The paper will also introduce the "CDISC Standards RDF Representation, Reference and Review Guide" published by CDISC and PhUSE CSS, discuss the representation, reference and review guide, and show how CDISC standards are represented and displayed in RDF format.
The paper will also introduce the SPARQL Protocol and RDF Query Language (SPARQL), which can retrieve and manipulate data in RDF format, and will show how programmers can use SPARQL to re-represent the RDF form of CDISC standards metadata as structured tabular data.
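The essence of such a SPARQL SELECT query, matching triple patterns that contain variables against a graph and returning the bindings as table rows, can be sketched in pure Python. The CDISC-like terms below are illustrative, not actual CDISC metadata, and a real program would issue a genuine SPARQL query through an engine such as rdflib:

```python
# Toy graph of CDISC-like variable metadata (illustrative terms only).
graph = [
    ("cdisc:AGE", "rdfs:label", "Age"),
    ("cdisc:AGE", "cdisc:dataType", "integer"),
    ("cdisc:SEX", "rdfs:label", "Sex"),
    ("cdisc:SEX", "cdisc:dataType", "text"),
]

def select(graph, pattern):
    """Mimic a one-pattern SPARQL SELECT: return one row of variable
    bindings per matching triple. Variables start with '?'."""
    rows = []
    for triple in graph:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                binding[term] = value      # variable: bind it
            elif term != value:
                break                      # constant mismatch: no match
        else:
            rows.append(binding)
    return rows

# Analogous to: SELECT ?var ?label WHERE { ?var rdfs:label ?label }
print(select(graph, ("?var", "rdfs:label", "?label")))
```

The list of binding dictionaries is exactly the structured tabular form a SAS programmer would want: one row per match, one column per variable.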
Finally, the paper will discuss the benefits and future of semantic technology, what it means for SAS programmers, and how programmers can take advantage of this new technology.
A keynote given on the FAIR Data Principles at the FAIRplus Innovation and SME Forum, Hinxton Genome Campus, Cambridge, UK, 29 January 2020. It covers the history of the principles, issues with the principles, and speculations about the future.
RO-Crate: A framework for packaging research products into FAIR Research Objects (Carole Goble)
RO-Crate: A framework for packaging research products into FAIR Research Objects, presented to the Research Data Alliance (RDA) Data Fabric/GEDE FAIR Digital Object meeting, 2021-02-25.
Short talk on Research Objects and their use for reproducibility and publishing in the Systems Biology Commons Platform FAIRDOMHub, and the underlying software SEEK.
As BioPharma adapts to incorporate nimble networks of suppliers, collaborators, and regulators, the ability to link data is critical for dynamic interoperability. Adopting the linked data paradigm allows BioPharma to focus on its core business: delivering valuable therapeutics in a timely manner.
Detecting Good Practices and Pitfalls when Publishing Vocabularies on the Web (María Poveda Villalón)
The uptake of Linked Data (LD) has promoted the proliferation of datasets and their associated ontologies, bringing their semantics to the data being published. These ontologies should be evaluated at different stages, both during their development and at publication. As important as correctly modelling the intended part of the world to be captured in an ontology is publishing, sharing and facilitating the (re)use of the resulting model. In this paper, 11 evaluation characteristics with respect to publishing, sharing and facilitating reuse are proposed: in particular, 6 good practices and 5 pitfalls are presented, together with their associated detection methods. In addition, a grid-based rating system is generated, showing the results of analysing the vocabularies gathered in the LOV repository. Both contributions, the set of evaluation characteristics and the grid system, can be useful for ontologists reusing existing LD vocabularies or checking the one being built.
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse (vty)
This presentation is about external controlled vocabulary (CV) support in Dataverse, an open source data repository. Data Archiving and Networked Services (DANS-KNAW) decided to use Dataverse as the base technology to build Data Stations and provide FAIR data services for various Dutch research communities.
FAIR Data
- Principles
- FAIR vs Open Data
- Implementing FAIR & FAIRmetrics
- FAIRness of ASIO-HERCULES
Research Objects
- Definition
- The RO-Crate standard
- Usage examples
Making Data FAIR (Findable, Accessible, Interoperable, Reusable) (Tom Plasterer)
What to do About FAIR…
In the experience of most pharma professionals, FAIR remains fairly abstract, bordering on inconclusive. This session outlines specific case studies, real problems with real data, and addresses opportunities and real concerns, including why making data Findable, Accessible, Interoperable and Reusable is important.
Talk presented at the Data Driven Drug Development (D4) conference on March 20th, 2019.
Environmental Thesauri Under the Lens of Reusability (EGOVIS 2014) (Riccardo Albertoni)
The development of a Spatial Data Infrastructure (SDI) at the European level is strategic to answer the needs of environmental management requested by European, national and local policies. Several European projects and initiatives aim to share, integrate and make accessible large amounts of environmental data in order to overcome cross-border/language/cultural barriers. To this purpose, environmental thesauri are used as shared nomenclatures in metadata compilation and information discovery, and they are increasingly made available on the web. This paper provides a methodological approach for creating a catalogue of the environmental thesauri available on the web and assessing their reusability with respect to domain-independent criteria. It highlights critical issues and provides some recommendations for improving thesauri reusability.
Presentation investigating the state of FAIR practice and what is needed to turn FAIR data into reality, given at the Danish FAIR conference in Copenhagen on 20th November 2018 (https://vidensportal.deic.dk/en/Programme/FAIR_Toolbox_Nov2018). The presentation reflects on recent FAIR studies and international initiatives and outlines the recommendations emerging from the European Commission's FAIR Data Expert Group report (http://tinyurl.com/FAIR-EG).
Due to the increasing uptake of semantic technologies, ontologies are becoming part of a growing number of software development projects. As a result, ontology development teams have to combine their activities with software development practices. In this presentation some practices, tools and examples of new trends in ontological engineering are provided.
Presentation for the paper: "Semantic Discovery in the Web of Things" at http://sisinflab.poliba.it/EnWoT/2017/
Abstract:
As the number of things present on the Web grows, the ability to discover such things in order to successfully interact with them becomes a challenge, mainly due to their heterogeneity.
The contribution of this paper is two-fold. First, an ontology-based approach to leverage web things discovery that is transparent to the syntax, protocols and formats used in things interfaces is described. Second, a semantic model for describing web things and how to extract and understand the relevant information for discovery is proposed.
Introduction to Linked Open Vocabularies (http://lov.okfn.org/) during EUDAT2017 (https://eudat.eu/events/trainings/eudat-semantic-working-group-at-9th-rda-plenary-barcelona-3-4-april-2017).
Ontology Evaluation: a pitfall-based approach to ontology diagnosis (María Poveda Villalón)
Ontology evaluation, which includes ontology diagnosis and repair, is a complex activity that should be carried out in every ontology development project, because it checks for the technical quality of the ontology. However, there is an important gap between the methodological work about ontology evaluation and the tools that support such an activity. More precisely, not many approaches provide clear guidance about how to diagnose ontologies and how to repair them accordingly.
This thesis aims to advance the current state of the art of ontology evaluation, specifically in the ontology diagnosis activity. The main goals of this thesis are (a) to help ontology engineers to diagnose their ontologies in order to find common pitfalls and (b) to lessen the effort required from them by providing the suitable technological support. This thesis presents the following main contributions:
• A catalogue that describes 41 pitfalls that ontology developers might include in their ontologies.
• A quality model for ontology diagnosis that aligns the pitfall catalogue with existing quality models for semantic technologies.
• The design and implementation of 48 methods for detecting 33 out of the 41 pitfalls defined in the catalogue.
• A system called OOPS! (OntOlogy Pitfall Scanner!) that allows ontology engineers to (semi)automatically diagnose their ontologies.
According to the feedback gathered and the satisfaction tests carried out, the approach developed and presented in this thesis effectively helps users increase the quality of their ontologies. At the time of writing, OOPS! has been broadly adopted by users worldwide, with around 3000 uses from 60 different countries. OOPS! is integrated with third-party software and installed locally in private enterprises, where it is used both for ontology development activities and for training courses.
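Beyond the web interface, OOPS! can be invoked programmatically. As a rough sketch (the endpoint URL and XML request shape below are assumptions drawn from the public OOPS! REST documentation and should be verified against https://oops.linkeddata.es before use), a diagnosis request can be built and sent like this:

```python
# Sketch: building a diagnosis request for the OOPS! web service.
# The endpoint URL and the XML payload shape are assumptions based on the
# public OOPS! REST documentation; verify them before relying on this.
import urllib.request

OOPS_ENDPOINT = "https://oops.linkeddata.es/rest"  # assumed endpoint

def build_oops_request(ontology_url: str, output_format: str = "RDF/XML") -> str:
    """Build the XML body that asks OOPS! to scan an ontology by URL."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<OOPSRequest>\n"
        f"  <OntologyUrl>{ontology_url}</OntologyUrl>\n"
        "  <OntologyContent></OntologyContent>\n"
        "  <Pitfalls></Pitfalls>\n"
        f"  <OutputFormat>{output_format}</OutputFormat>\n"
        "</OOPSRequest>\n"
    )

def scan(ontology_url: str) -> bytes:
    """POST the request and return the raw diagnosis report (network call)."""
    body = build_oops_request(ontology_url).encode("utf-8")
    req = urllib.request.Request(
        OOPS_ENDPOINT, data=body, headers={"Content-Type": "application/xml"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()
```

The returned report lists detected pitfalls per ontology element, which can then be fed into a repair workflow.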
Removing Uninteresting Bytes in Software Fuzzing - Aftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
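DIAR's actual byte analysis is not reproduced here, but the general idea of dropping seed bytes whose removal leaves observed program behavior unchanged can be sketched as a greedy trimming loop, with `coverage` standing in for a run of the instrumented target:

```python
# Greedy sketch of seed trimming: drop each byte whose removal leaves the
# coverage fingerprint unchanged. `coverage` is a stand-in for running the
# instrumented target; DIAR's real byte analysis is more sophisticated.
from typing import Callable

def trim_seed(seed: bytes, coverage: Callable[[bytes], object]) -> bytes:
    baseline = coverage(seed)
    out = bytearray(seed)
    i = 0
    while i < len(out):
        candidate = out[:i] + out[i + 1:]
        if coverage(bytes(candidate)) == baseline:
            out = candidate          # byte i was uninteresting: drop it
        else:
            i += 1                   # byte i matters: keep it, move on
    return bytes(out)
```

With a toy oracle that fingerprints only which byte values occur, `trim_seed(b"aabbc", lambda d: frozenset(d))` shrinks the seed to `b"abc"`.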
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work, along with a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
A tale of scale & speed: How the US Navy is enabling software delivery from l... - sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
The Art of the Pitch: WordPress Relationships and Sales - Laura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips and strategies for successful relationship building that leads to closing the deal.
GraphRAG is All You Need? LLM & Knowledge Graph - Guy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
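The core GraphRAG move, retrieving structured facts from a knowledge graph and splicing them into the LLM prompt, can be illustrated with a minimal sketch. The toy triple store, entity matching, and prompt template below are invented for illustration and bear no relation to Microsoft's actual GraphRAG implementation:

```python
# Toy sketch of the retrieval half of graph-based RAG: pull triples about
# entities named in the question and format them as LLM prompt context.
# The graph contents and prompt template are invented for illustration.
KG = [
    ("FalkorDB", "is_a", "graph database"),
    ("GraphRAG", "proposed_by", "Microsoft Research"),
    ("GraphRAG", "combines", "knowledge graphs and LLMs"),
]

def retrieve(question: str, graph=KG):
    """Return triples whose subject is mentioned in the question."""
    return [t for t in graph if t[0].lower() in question.lower()]

def build_prompt(question: str) -> str:
    """Assemble retrieved facts plus the question into an LLM prompt."""
    facts = "\n".join(f"- {s} {p} {o}" for s, p, o in retrieve(question))
    return f"Context:\n{facts}\n\nQuestion: {question}"
```

A real system would replace the list comprehension with a graph query (e.g. Cypher against a graph database) and entity linking, but the prompt-assembly shape stays the same.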
Communications Mining Series - Zero to Hero - Session 1 - DianaGray10
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How can it help today’s business and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Transcript: Selling digital books in 2024: Insights from industry leaders - T... - BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Epistemic Interaction - tuning interfaces to provide information for AI support - Alan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices want to take full advantage of the features available on those devices, but many of those features provide convenience and capability at the expense of security. This best practices guide outlines steps users can take to better protect personal devices and information.
Climate Impact of Software Testing at Nordic Testing Days - Kari Kakkonen
My slides at Nordic Testing Days 6.6.2024
The climate impact and sustainability of software testing are discussed in the talk. ICT and testing must carry their part of the global responsibility to help with climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Quality characteristics can be extended with sustainability and then measured continuously. Test environments can be used less, at a smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf - Peter Spielvogel
Building better applications for business users with SAP Fiori.
• What is SAP Fiori and why it matters to you
• How a better user experience drives measurable business benefits
• How to get started with SAP Fiori today
• How SAP Fiori elements accelerates application development
• How SAP Build Code includes SAP Fiori tools and other generative artificial intelligence capabilities
• How SAP Fiori paves the way for using AI in SAP apps
GridMate - End to end testing is a critical piece to ensure quality and avoid... - ThomasParaiso2
End to end testing is a critical piece to ensure quality and avoid regressions. In this session, we share our journey building an E2E testing pipeline for GridMate components (LWC and Aura) using Cypress, JSForce, FakerJS…
Coming to terms to FAIR semantics
1. Coming to terms to FAIR Ontologies
22nd International Conference on Knowledge Engineering and Knowledge Management (EKAW 2020), Virtual, 18 September 2020
María Poveda-Villalón (1), Paola Espinoza-Arias (1), Daniel Garijo (2), Oscar Corcho (1)
(1) Ontology Engineering Group, Universidad Politécnica de Madrid
(2) Information Sciences Institute, University of Southern California
mpoveda@fi.upm.es, pespinoza@fi.upm.es, dgarijo@isi.edu, ocorcho@fi.upm.es
This work has been supported by a Predoctoral grant from the I+D+i program of the Universidad Politécnica de Madrid and the Spanish project DATOS 4.0: RETOS Y SOLUCIONES (TIN2016-78011-C4-4-R).
2. “Coming to terms to FAIR Ontologies” by María Poveda-Villalón, Paola Espinoza-Arias, Daniel Garijo and Oscar Corcho
Introduction
[Diagram: Linked Data, Open Data, FAIR Data. Image taken from https://www.w3.org/DesignIssues/LinkedData.html]
Linked Data principles
Adoption:
• EOSC interoperability framework
• Research Data Alliance
Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship (2016). https://doi.org/10.1038/sdata.2016.18, https://www.nature.com/articles/sdata201618
3. Introduction
• There is a clear movement towards expanding the application of the FAIR principles beyond research data [EOSC Interoperability Framework]
• Ontologies are often the result of research activities or fundamental components in many research areas
• Some initiatives already exist (FAIRsFAIR EU Project recommendations, GO-FAIR implementation network GO-INTER, RDA Vocabulary Services Interest Group, “Best Practices for Implementing FAIR Vocabularies and Ontologies on the Web”…)
How do these works fit with the Ontology Engineering community?
There is a need to open a broader discussion of the technical and social consequences of adopting the FAIR principles for the publication and sharing of ontologies, and such a discussion should incorporate the views of the Ontology Engineering community.
4. Outline
1) Comparing existing approaches
2) Semantic Web practices to be adopted and open issues for FAIR Ontologies
5. Framework
17 recommendations, each related to one or more FAIR principles, covering:
• GUPRIs (Globally Unique, Persistent and Resolvable Identifiers)
• (Minimum) metadata, including provenance, license, etc.
• Semantic repositories (API, cross access, secure protocols)
• Use of standards (languages, vocabularies)
• Mappings (between artefacts, to foundational ontologies)
Le Franc, Y., Parland-von Essen, J., Bonino, L., Lehväslaiho, H., et al. D2.2 FAIR Semantics: First Recommendations (2020). https://doi.org/10.5281/zenodo.3707985
6. Framework
“Best Practices for Implementing FAIR Vocabularies and Ontologies on the Web” (Semantic Web & Ontology Engineering)
10 guidelines for publishing FAIR ontologies and vocabularies, related to:
• Accessible and permanent ontology URIs
• Generation of reusable documentation (metadata and human-oriented)
• Publication of ontologies on the Web (formats, findability)
Garijo, Daniel, and María Poveda-Villalón. "Best Practices for Implementing FAIR Vocabularies and Ontologies on the Web." arXiv preprint arXiv:2003.13084 (2020)
7. Framework
5-star vocabularies (Vatant, Bernard, 2012):
★ Publish your vocabulary on the Web at a stable URI with an open license
★★ Provide human-readable documentation and basic metadata such as creator, publisher, date of creation, last modification, version number
★★★ Provide labels and descriptions, if possible in several languages, to make your vocabulary usable in multiple linguistic scopes
★★★★ Make your vocabulary available via its namespace URI, both as a formal file and human-readable documentation, using content negotiation
★★★★★ Link to other vocabularies by re-using elements rather than re-inventing
Vatant, Bernard. "5 stars for vocabularies." https://bvatant.blogspot.com/2012/02/is-your-linked-data-vocabulary-5-star_9588.html (2012)
5-star vocabularies (SWJ 2014):
★ There is dereferenceable human-readable information about the used vocabulary
★★ The information is available as machine-readable explicit axiomatization of the vocabulary
★★★ The vocabulary is linked to other vocabularies
★★★★ Metadata about the vocabulary is available (in a dereferenceable and machine-readable form)
★★★★★ The vocabulary is linked to by other vocabularies
Janowicz, K., Hitzler, P., Adams, B., Kolas, D., & Vardeman II, C. (2014). Five Stars of Linked Data Vocabulary Use. Semantic Web, 5(3).
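The fourth star of the 2012 scheme (serving the vocabulary from its namespace URI both as a formal file and as human-readable documentation, via content negotiation) is usually implemented server-side. A minimal sketch of the selection logic follows; the file names and MIME-type mapping are illustrative placeholders, not a prescribed configuration:

```python
# Minimal sketch of server-side content negotiation for a vocabulary URI:
# map the client's Accept header to either a machine-readable ontology
# file or the human-readable documentation. Mappings are illustrative.
REPRESENTATIONS = {
    "text/turtle": "ontology.ttl",
    "application/rdf+xml": "ontology.owl",
    "text/html": "index.html",   # human-readable documentation
}

def negotiate(accept_header: str, default: str = "index.html") -> str:
    """Return the file to serve for the given Accept header."""
    for media_type in accept_header.split(","):
        media_type = media_type.split(";")[0].strip()  # drop q-values
        if media_type in REPRESENTATIONS:
            return REPRESENTATIONS[media_type]
    return default
```

In production this mapping typically lives in the web server configuration (e.g. rewrite rules keyed on the Accept header) rather than in application code, but the decision table is the same.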
8. Framework
[Diagram relating “Best Practices for Implementing FAIR Vocabularies and Ontologies on the Web”, the 5-star vocabularies (Vatant, Bernard, 2012), and the 5-star vocabularies (SWJ 2014)]
9. Framework
Feedback to the FAIRsFAIR project:
• Merge guidelines
• Reconsider mappings to FAIR principles
• Relax and broaden the scope of foundational ontologies
• Clarify standard vs non-standard technologies
10. Framework
Observations: principles not addressed:
• F3 (metadata and data linked): in the Semantic Web, metadata is normally embedded
• A1.2 (protocol authentication): in the Semantic Web, normally an open license (5-stars 2012) and the HTTP(S) protocol (LOD principles)
• A2 (metadata available without data): in the Semantic Web, metadata is normally embedded; on the Web, resources might become unavailable
11. Outline
1) Comparing existing approaches
2) Semantic Web practices to be adopted and open issues for FAIR Ontologies
12. Towards FAIR Ontologies – To be Findable

Principle | Keep from SW | Needs | Discussion
F1 | URIs | Persistence |
F2 | Minimum metadata, technical guidelines | |
F3 | Metadata included in the ontology | | Metadata as a separate object, third-party certifier
F4 | DCAT2 | | Federation model, SAODs

F1: (meta)data are assigned a globally unique and persistent identifier
F2: data are described with rich metadata (defined by R1 below)
F3: metadata clearly and explicitly include the identifier of the data it describes
F4: (meta)data are registered or indexed in a searchable resource
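The "rich metadata" F2 asks for can start as a handful of annotation triples in the ontology header. The sketch below emits a minimal Turtle header of that kind; the ontology IRI and property values are placeholders, and a FAIR ontology would carry considerably richer metadata than this:

```python
# Sketch: emit a minimal Turtle ontology header carrying the kind of
# metadata F2 asks for (title, creator, license, version). The IRI and
# values are placeholders; real ontologies need richer metadata.
def ontology_header(iri: str, title: str, creator: str,
                    license_iri: str, version: str) -> str:
    return "\n".join([
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{iri}> a owl:Ontology ;",
        f'    dcterms:title "{title}" ;',
        f'    dcterms:creator "{creator}" ;',
        f"    dcterms:license <{license_iri}> ;",
        f'    owl:versionInfo "{version}" .',
    ])
```

Publishing this header at the ontology's namespace URI also feeds F3 (the metadata explicitly carries the identifier of the thing it describes).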
13. Towards FAIR Ontologies – To be Accessible

(F rows as on the previous slide)
Principle | Keep from SW | Needs | Discussion
A1, A1.1, A1.2 | HTTP and HTTPS | |
A2 | | Preservation policies |

A1: (meta)data are retrievable by their identifier using a standardized communications protocol
A1.1: the protocol is open, free, and universally implementable
A1.2: the protocol allows for an authentication and authorization procedure, where necessary
A2: metadata are accessible, even when the data are no longer available
14. Towards FAIR Ontologies – To be Interoperable

(F and A rows as on the previous slides)
Principle | Keep from SW | Needs | Discussion
I1 | KR languages | |
I2 | Methods to reuse ontologies | | Do not force reuse of FAIR vocabularies
I3 | Mechanisms to reference ontologies | Indicators |

I1: (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
I2: (meta)data use vocabularies that follow FAIR principles
I3: (meta)data include qualified references to other (meta)data
15. Towards FAIR Ontologies – To be Reusable

(F, A and I rows as on the previous slides)
Principle | Keep from SW | Needs | Discussion
R1 | | Best practices to document and communicate ontologies |
R1.1 | Link to the license URI or an RDF description of it | |
R1.2 | PROV-O | |
R1.3 | Community standards | |

R1: (meta)data are richly described with a plurality of accurate and relevant attributes
R1.1: (meta)data are released with a clear and accessible data usage license
R1.2: (meta)data are associated with detailed provenance
R1.3: (meta)data meet domain-relevant community standards
16. Towards FAIR Ontologies (summary slide combining the Findable, Accessible, Interoperable, and Reusable tables above)
17. Closing slide (title, authors, contact details, and acknowledgements, as on slide 1)