The document discusses the objectives and outcomes of the FAIRport Skunkworks team so far. The team is exploring existing technologies to build prototype FAIRport code components using existing standards. They aim to enable findable, accessible, interoperable, and reusable data across repositories. However, repositories use different metadata schemas and standards like DCAT in incomplete ways. The team proposes "FAIR Profiles" - a generic way to describe metadata fields and constraints for any repository using a standardized vocabulary and structure. This would enable rich queries across repositories. They define a FAIR Profile Schema to serve as a lightweight meta-meta-descriptor for describing diverse repository metadata schemas in a consistent way.
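The "FAIR Profile" idea described above can be sketched minimally: a profile is a meta-metadata record declaring which fields a repository exposes, which vocabulary term each field maps to, and simple constraints. Everything here (the repository name, field list, and `validate` helper) is a hypothetical illustration, not the team's actual schema.

```python
# Hypothetical sketch of a "FAIR Profile": a meta-metadata record that
# declares the metadata fields a repository exposes, the vocabulary term
# each field maps to, and whether the field is required.
profile = {
    "repository": "ExampleRepo",  # invented repository name
    "fields": [
        {"name": "title",   "term": "dcterms:title",   "required": True},
        {"name": "creator", "term": "dcterms:creator", "required": True},
        {"name": "subject", "term": "dcterms:subject", "required": False},
    ],
}

def validate(record, profile):
    """Return the names of required profile fields missing from a record."""
    return [f["name"] for f in profile["fields"]
            if f["required"] and f["name"] not in record]

missing = validate({"title": "Rainfall 2015"}, profile)
```

Because the profile uses a shared vocabulary (Dublin Core terms here), the same query logic could be pointed at any repository that publishes such a profile.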
Data FAIRport Skunkworks: Common Repository Access Via Meta-Metadata Descript... – datascienceiqss
It would be useful to be able to discover what kinds of data are contained in the myriad general-purpose public data repositories. It would be even better if it were possible to query that data and/or have that data conform to a particular context-dependent data format. This was the ambition of the Data FAIRport project. I will be presenting a "strawman" demonstration of a fully functional Data FAIRport, where the meta/data in a public repository can be "projected" into one of a number of context-dependent formats, such that it can be cross-queried in combination with the (potentially "projected") data from other repositories.
How to describe a dataset. Interoperability issues – Valeria Pesce
Presented by Valeria Pesce during the pre-meeting of the Agricultural Data Interoperability Interest Group (IGAD) of the Research Data Alliance (RDA), held on 21 and 22 September 2015 in Paris at INRA.
20161004 “Open Data Web” – A Linked Open Data Repository Built with CKAN – andrea huang
Our team is in Madrid (#CKANCon) to introduce our #LODLAM implementation. http://data.odw.tw is just out (slides at https://goo.gl/KJApV8). If you are at #IODC16, you are also welcome to discuss with our team in person. #opendata
More introduction about data.odw.tw can be accessed at https://goo.gl/YUSI74 (Chinese) and https://goo.gl/2u07Ap (English).
Linking Open, Big Data Using Semantic Web Technologies - An Introduction – Ronald Ashri
The Physics Department of the University of Cagliari and the Linkalab Group invited me to talk about the Semantic Web and Linked Data - this is simply an introduction to the technologies involved.
This chapter introduces the semantic modeling procedure, detailing its technical characteristics, possibilities and limitations. First, we present the languages that are used for semantic description. We present RDF, RDFS and OWL, describe their expressiveness in terms of describing Web Resources, and the abilities they provide in order to describe, query, administer and manage resources at a semantic layer. Next, we present the vocabularies that are used in order to provide common grounds in understanding and communicating ideas and concepts. The technologies, together with the vocabularies used, altogether comprise the modern landscape of Semantic Web/Linked Data applications and serve as the basis for maintaining, analyzing datasets and building applications on top of them.
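The RDF data model the chapter describes can be illustrated in a few lines: every statement is a (subject, predicate, object) triple, and querying reduces to pattern matching over the set of triples. This is a stdlib-only sketch; the prefixes and resource names are invented, and a real application would use an RDF library and SPARQL.

```python
# Minimal illustration of the RDF data model: statements are
# (subject, predicate, object) triples, and a query is a pattern
# in which None acts as a wildcard.
triples = {
    ("ex:ds1", "rdf:type",      "dcat:Dataset"),
    ("ex:ds1", "dcterms:title", "Rainfall 2015"),
    ("ex:ds2", "rdf:type",      "dcat:Dataset"),
}

def match(pattern, graph):
    """Return triples matching an (s, p, o) pattern; None matches anything."""
    return {t for t in graph
            if all(p is None or p == v for p, v in zip(pattern, t))}

# "Find every resource typed as a dcat:Dataset."
datasets = match((None, "rdf:type", "dcat:Dataset"), triples)
```

RDFS and OWL then layer schema and inference on top of exactly this triple structure.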
This presentation introduces OpenLink Virtuoso – "The Prometheus of RDF" – covering Linked Data verticals and patterns involving the Web and Big Data, SPARQL and RDF, the RDF tax, and many others.
Presentation for CLARIAH IG Linked Open Data on the latest developments for Dataverse FAIR data repository. Building SEMAF workflow with external controlled vocabularies support and Semantic API.
Transient and persistent RDF views over relational databases in the context o... – Nikolaos Konstantinou
As far as digital repositories are concerned, numerous benefits emerge from publishing their contents as Linked Open Data (LOD), which leads more and more repositories in this direction. However, several factors need to be taken into account in doing so, among them whether the transition should be materialized in real time or at asynchronous intervals. In this paper we frame the problem in the context of digital repositories, discuss the benefits and drawbacks of both approaches, and draw our conclusions after evaluating a set of performance measurements. Overall, we argue that in contexts with infrequent data updates, as is the case with digital repositories, persistent RDF views are more efficient than real-time SPARQL-to-SQL rewriting systems in terms of query response times, especially when expensive SQL queries are involved.
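The trade-off the abstract argues for can be sketched with a toy counter: a persistent (materialized) view pays the relational-to-RDF export cost once and serves queries from the copy, whereas a per-query rewriting system pays it on every request. The rows, URIs, and export function here are invented stand-ins, not the paper's actual benchmark.

```python
# Sketch contrasting the two strategies: a persistent RDF view is
# exported once and re-served, while per-query rewriting would re-run
# the (expensive) export for every incoming query.
rows = [(1, "Alice"), (2, "Bob")]  # stand-in for database tuples
export_calls = 0

def export_to_triples(rows):
    """Pretend relational-to-RDF export; counts how often it runs."""
    global export_calls
    export_calls += 1
    return [("ex:person/%d" % i, "foaf:name", n) for i, n in rows]

# Persistent view: export once, then answer many queries from the copy.
view = export_to_triples(rows)
for _ in range(10):
    _ = [t for t in view if t[1] == "foaf:name"]
```

With infrequent updates, `export_calls` stays at 1 for the materialized view, versus 10 under naive per-query rewriting, which mirrors the paper's response-time argument.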
The Bounties of Semantic Data Integration for the Enterprise – Ontotext
If you are looking for solutions that allow you not only to manage all of your data (structured, semi-structured and unstructured) but to also make the most out of them, using a common language is critical.
Semantic technology added to data integration is the glue that holds together all your enterprise data and their relationships in a meaningful way.
Learn how you can quickly design data processing jobs and integrate massive amounts of data and see what semantic integration can do for your data and your business.
www.ontotext.com
Relations for Reusing (R4R) in A Shared Context: An Exploration on Research P... – andrea huang
Will the rich domain knowledge from research publications and the implicit cross-domain metadata of cultural objects be compliant with each other? A contextual framework is proposed, dynamic and relational, supporting three different contexts – Reusing, Publication and Curation – which are individually constructed but overlap in their major conceptual elements. A Relations for Reusing (R4R) ontology has been devised for modeling these overlapping conceptual components (Article, Data, Code, Provenance, and License) and for interlinking research outputs with cultural heritage data. In particular, packaging and citation relations are key to building up interpretations for dynamic contexts. Examples illustrate how the linking mechanism can be constructed and represented, revealing the data linked in different contexts.
Transforming Your Data with GraphDB: GraphDB Fundamentals, Jan 2018 – Ontotext
These are slides from a live webinar that took place in January 2018.
GraphDB™ Fundamentals builds the basis for working with graph databases that utilize the W3C standards, particularly GraphDB™. In this webinar, we demonstrated how to install and set up GraphDB™ 8.4 and how to generate your first RDF dataset. We also showed how to quickly integrate complex and highly interconnected data using RDF and SPARQL, and much more.
With the help of GraphDB™, you can start smartly managing your data assets, visually represent your data model and get insights from them.
An Approach for the Incremental Export of Relational Databases into RDF Graphs – Nikolaos Konstantinou
Several approaches have been proposed in the literature for offering RDF views over databases. In addition, a variety of tools exist that export database contents into RDF graphs. Approaches in the latter category have often been shown to perform better than those in the former. However, when database contents are exported into RDF, it is not always optimal, or even necessary, to export (or "dump", as this procedure is often called) the whole database contents every time. This paper investigates the problem of incremental generation and storage of the RDF graph that results from exporting relational database contents. To express mappings that associate tuples from the source database with triples in the resulting RDF graph, an implementation of the R2RML standard is put to the test. Next, a methodology is proposed and described that enables incremental generation and storage of the RDF graph originating from the source relational database contents. The performance of this methodology is assessed through an extensive set of measurements. The paper concludes with a discussion of the authors' most important findings.
In this Chapter, we consider relational databases as a data source for the generation of Linked Data, given that they constitute one of the most popular data storage media, containing huge data volumes that feed the vast majority of information systems worldwide. In this context, we review the related literature and reveal the main motivations that fuel the relevant approaches, and the benefits that arise from their application. We present a categorization of approaches that map relational databases to the Semantic Web and identify tool implementations that extract RDF graphs from relational database instances. We also sketch a proof-of-concept use case scenario regarding how a repository with scholarly information can be converted to a Linked Data endpoint. The Chapter ends with a discussion of the open issues and future outlook for the problem of RDF generation from relational databases.
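The incremental-export idea from the abstracts above can be sketched as change detection over source rows: keep a digest per row and regenerate triples only for rows whose digest changed since the last dump. The row keys and data below are invented, and a real pipeline would drive the regeneration through R2RML mappings rather than this toy scheme.

```python
import hashlib

def digest(row):
    """Stable fingerprint of a source row's contents."""
    return hashlib.sha256(repr(row).encode()).hexdigest()

def incremental_export(rows, seen):
    """Return keys of new or modified rows; update `seen` in place.

    Only these rows need their RDF triples regenerated; unchanged
    rows keep the triples produced by the previous dump.
    """
    changed = []
    for key, row in rows.items():
        d = digest(row)
        if seen.get(key) != d:
            seen[key] = d
            changed.append(key)
    return changed

seen = {}
first = incremental_export({"r1": ("Alice",), "r2": ("Bob",)}, seen)
second = incremental_export({"r1": ("Alice",), "r2": ("Bobby",)}, seen)
```

The first pass exports everything; the second touches only the modified row, which is the saving the paper's measurements quantify.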
Analytics on Big Knowledge Graphs Deliver Entity Awareness and Help Data Linking – Ontotext
A presentation by Ontotext’s CEO Atanas Kiryakov, given during Semantics 2018 – an annual conference that brings together researchers and professionals from all over the world to share knowledge and expertise on semantic computing.
Since 6 September 2010, the @F3Lorraine account has been present on Twitter. Why, how, and what new practices accompanied this editorial choice? Some answers from Jean-Christophe Dupuis-Rémond, journalist and community manager of the France Télévisions regional newsroom for Lorraine.
Data FAIRport Prototype & Demo - Presentation to Elsevier, Jul 10, 2015 – Mark Wilkinson
A discussion and demonstration of a functional Data FAIRport, using W3C's Linked Data Platform, Ruben Verborgh's Linked Data Fragments, and Hydra's hypermedia controlled vocabularies. This is the output of the "Skunkworks" working group of the larger Data FAIRport project (http://datafairport.org).
Panel presentation to a graduate class at the University of Arizona School of Information Resources and Library Science. Invited by Dr. Jana Bradley. July 2006.
Metadata & brokering - a modern approach #2 – Daniele Bailo
The second episode of metadata and brokering.
Topics covered:
1. additional definition (ontology, relational database and others)
2. the wide picture: data fabric elements from Research Data Alliance (RDA) and possible concrete implementations of those guidelines
FAIR Data Prototype - Interoperability and FAIRness through a novel combinati... – Mark Wilkinson
This slide deck accompanies the manuscript "Interoperability and FAIRness through a novel combination of Web technologies", submitted to PeerJ Computer Science: https://doi.org/10.7287/peerj.preprints.2522v1
It describes the output of the "Skunkworks" FAIR implementation group, who were tasked with building a prototype infrastructure that would fulfill the FAIR Principles for scholarly data publishing. We show how a novel combination of the Linked Data Platform, RDF Mapping Language (RML) and Triple Pattern Fragments (TPF) can be combined to create a scholarly publishing infrastructure that is markedly interoperable, at both the metadata and the data level.
This slide deck (or something close) will be presented at the Dutch Techcenter for Life Sciences Partners Workshop, November 4, 2016.
Spanish Ministerio de Economía y Competitividad grant number TIN2014-55993-R
This presentation was provided by Vinod Chachra of VTLS Inc. during the NISO event "Next Generation Discovery Tools: New Tools, Aging Standards," held March 27 - March 28, 2008.
Presentation by Luiz Olavo Bonino about the current state of the developments on FAIR Data supporting tools at the Dutch Techcentre for Life Sciences Partners Event on November 3-4 2016.
Data Wrangling and Visualization Using Python – MOHITKUMAR1379
Python is open source and has many libraries for data wrangling and visualization that make the life of data scientists easier. For data wrangling, pandas is used, as it represents tabular data and offers functions to parse data from different sources, clean data, handle missing values, merge data sets, and more. To visualize data, the low-level matplotlib library can be used; it is also the base package for higher-level packages such as seaborn, which can draw a well-customized plot in just one line of code. Python has the Dash framework, which is used to make interactive web applications in Python code without JavaScript and HTML. These Dash applications can be published on any server, as well as on clouds like Google Cloud, and for free on the Heroku cloud.
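Two of the wrangling steps the paragraph attributes to pandas – filling missing values and merging data sets on a shared key – can be sketched with the standard library alone. The records and keys are invented; pandas would do the same with `fillna` and `merge`.

```python
# Stdlib-only sketch of two wrangling steps usually done with pandas:
# filling missing values, then an inner merge of two record sets on "id".
left = [{"id": 1, "name": "ann"}, {"id": 2, "name": None}]
right = [{"id": 1, "score": 0.9}, {"id": 2, "score": 0.7}]

# Fill missing values with a default (pandas: df.fillna("unknown")).
for rec in left:
    if rec["name"] is None:
        rec["name"] = "unknown"

# Inner merge on "id" (pandas: left_df.merge(right_df, on="id")).
by_id = {r["id"]: r for r in right}
merged = [{**l, **by_id[l["id"]]} for l in left if l["id"] in by_id]
```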
Tech. session : Interoperability and Data FAIRness emerges from a novel combi... – Mark Wilkinson
My presentation to OAI10 - CERN - UNIGE Workshop on Innovations in Scholarly Communication, 21-23 June 2017
University of Geneva.
https://indico.cern.ch/event/405949/contributions/2487823/
A description of the FAIR Accessor and FAIR Projector technologies: REST-compliant approaches to publishing FAIR Metadata and FAIR Data (respectively)
Spanish Ministerio de Economía y Competitividad TIN2014-55993-R
Metadata management for data storage spaces:
INDEXATOR is a metadata management tool that addresses the problems of organising, documenting, storing and sharing data in a research unit or infrastructure, and fits perfectly into a data management plan of a collective.
The central idea is that the storage space becomes the data repository, so the metadata should go to the data and not the other way around.
Given the diversity of domains, the approach chosen is to be both as flexible and as pragmatic as possible by allowing each collective to choose its own (controlled) vocabulary corresponding to the reality of its field and activities. The main idea is to be able to "capture" the user's metadata as easily as possible using their vocabulary. It is possible to define the whole terminology using a spreadsheet.
The choice was made for the JSON format, which is very appropriate for describing metadata, readable by both humans and machines.
This tool is built around a web interface coupled with a MongoDB database. The web interface allows you to: i) describe a dataset using metadata of various types (Description), and ii) search datasets by their metadata (Accessibility).
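An INDEXATOR-style record can be sketched as a JSON document that travels with the data, with field names drawn from the collective's own controlled vocabulary. The paths, vocabulary terms, and `search` helper below are hypothetical illustrations, not the tool's actual schema or API.

```python
import json

# Hypothetical metadata record: the metadata "goes to the data" as a
# JSON document whose vocabulary the collective defines itself.
record = json.loads("""{
    "path": "/storage/project-x/run-042",
    "vocabulary": {"instrument": "sensor-A", "campaign": "2021-spring"},
    "description": "Raw measurements, run 42"
}""")

def search(records, field, value):
    """Toy stand-in for the MongoDB-backed metadata search."""
    return [r for r in records if r["vocabulary"].get(field) == value]

hits = search([record], "campaign", "2021-spring")
```

Because the vocabulary lives inside the record, each collective can define its terms in a spreadsheet and still share one search mechanism.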
Dataset Catalogs as a Foundation for FAIR* Data – Tom Plasterer
BioPharma and the broader research community is faced with the challenge of simply finding the appropriate internal and external datasets for downstream analytics, knowledge-generation and collaboration. With datasets as the core asset, we wanted to promote both human and machine exploitability, using web-centric data cataloguing principles as described in the W3C Data on the Web Best Practices. To do so, we adopted DCAT (Data CATalog Vocabulary) and VoID (Vocabulary of Interlinked Datasets) for both RDF and non-RDF datasets at summary, version and distribution levels. Further, we’ve described datasets using a limited set of well-vetted public vocabularies, focused on cross-omics analytes and clinical features of the catalogued datasets.
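A DCAT-style catalog entry at the summary/version/distribution levels mentioned above might look like the sketch below. All titles, versions, and media types are invented; a real catalog would serialize such entries as RDF.

```python
# Hedged sketch of a DCAT-style dataset record with versioning and
# per-distribution media types; every value here is an invented example.
entry = {
    "@type": "dcat:Dataset",
    "dcterms:title": "Example clinical-omics dataset",
    "dcat:version": "1.2",
    "dcat:distribution": [
        {"@type": "dcat:Distribution", "dcat:mediaType": "text/csv"},
        {"@type": "dcat:Distribution",
         "dcat:mediaType": "application/n-triples"},
    ],
}

def formats(dataset):
    """List the media types a dataset is distributed in."""
    return [d["dcat:mediaType"] for d in dataset["dcat:distribution"]]
```

Describing RDF and non-RDF distributions side by side is what lets both humans and machines discover the right serialization of a dataset.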
... or how to query an RDF graph with 28 billion triples on a standard laptop
These slides correspond to my talk at the Stanford Center for Biomedical Informatics, on 25th April 2018
This is a short presentation about the FAIR Metrics Evaluator - software that automates the application of FAIR Metrics against a given resource, in order to determine its degree of "FAIRness"
An overview of the current functionality of the FAIR Evaluator - a framework for automating the evaluation of FAIRness of digital resources. The screenshots here are of the early strawman prototype, which is only available for use by the FAIR Metrics Authoring group at this time. Nevertheless, feedback on the functionality of the Evaluator would be welcome! We anticipate having a fully public version before August 2018.
This work is supported, in part, by the Ministerio de Economía y Competitividad grant number TIN2014-55993-RM
Quickly re-publish CSV/TSV files from existing repositories as FAIR Data with just a few mouse clicks!
You select the columns to "project" as Linked Data, and the associated ontology terms. The FAIR Projector Builder will create a FAIR Projector for you: a Triple Pattern Fragment server to provide the Linked Data; a published DCAT Distribution containing metadata about those triples and their source; and an RML model (the syntax and semantics of the triples) to aid third-party discovery of this novel projection.
(current status - first prototype, not ready for public consumption)
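The core "projection" step – selected CSV columns becoming triples under user-chosen ontology terms – can be sketched as below. The column-to-term mapping and subject URIs are invented; the actual builder emits an RML model and serves the triples through a Triple Pattern Fragment server rather than returning a Python list.

```python
import csv
import io

# Hypothetical user choices: which columns to project, and onto which terms.
mapping = {"name": "foaf:name", "age": "foaf:age"}

def project(csv_text, subject_col, mapping):
    """Turn selected CSV columns into (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        s = "ex:row/" + row[subject_col]  # invented subject URI scheme
        triples += [(s, term, row[col]) for col, term in mapping.items()]
    return triples

triples = project("id,name,age\n7,Alice,30\n", "id", mapping)
```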
-------
Thanks to the NBDC/DBCLS for sponsoring the hackathon series.
MDW also funded by Ministerio de Economía y Competitividad grant number TIN2014-55993-RM
smartAPIs: EUDAT Semantic Working Group Presentation @ RDA 9th Plenary – Mark Wilkinson
smartAPIs are an approach to the incremental, machine-aided, semantic annotation of Web APIs. Starting from existing, popular standards, we will provide enhanced tools for authoring ever-richer metadata, guided by global community knowledge encapsulated in ontologies, and aided by "smart suggestions" based on mining the metadata from previous API specifications.
The project is led by Michel Dumontier (Maastricht University). This presentation was given on his behalf by Mark Wilkinson (UPM, Madrid; Spanish Ministerio de Economía y Competitividad grant number TIN2014-55993-R)
IBC FAIR Data Prototype Implementation slideshow – Mark Wilkinson
Discussion about ways of achieving FAIRness of both metadata and data. Brute force approaches, and more elegant "projection" approaches are shown.
Relevant papers are at:
doi: 10.7717/peerj-cs.110 (https://peerj.com/articles/cs-110/)
doi: 10.3389/fpls.2016.00641 (https://doi.org/10.3389/fpls.2016.00641)
Spanish Ministerio de Economía y Competitividad grant number TIN2014-55993-R
Building SADI Services Tutorial - SIB Workshop, Geneva, December 2015 – Mark Wilkinson
The primary slide deck for the SADI tutorial. We explain the motivation, simple SADI services, more complex SADI services, and then do a detailed walk-through of building a service, including the Perl service code and examples of service invocation at the command line, and using the SHARE client. You will want to look at the sample data/queries in this slide deck: http://www.slideshare.net/markmoby/sample-data-and-other-ur-ls-55737183 and the example service code in this slide deck: http://www.slideshare.net/markmoby/example-code-for-the-sadi-bmi-calculator-web-service?related=1
Example code for the SADI BMI Calculator Web Service – Mark Wilkinson
Two versions of the code for the SADI Web Service demonstrated at the Using the Semantic Web for faster (Bio-)Research workshop hosted by the Swiss Institute for Bioinformatics, Geneva, December, 2015. The first version of the code is a bare-bones service that consumes individuals with height and weight and returns individuals with a BMI. The second piece of code is functionally identical to the first, but highlights the small changes required to make the service a NanoPublisher (NanoPublishing services respond to Accept n-quads HTTP headers by returning NanoPublications, rather than just a stream of triples)
Perl code for a SADI service that calculates BMI. The first panel is the code for a traditional SADI service, the second panel highlights the minor changes required to convert the service into a service that outputs NanoPublications.
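The computation at the heart of the demo service is simply BMI = weight / height². The original service is in Perl; this is a hedged Python re-sketch of that calculation alone, with invented field names, leaving out the SADI service plumbing and the NanoPublication output entirely.

```python
# Python re-sketch of the BMI step inside the SADI demo service:
# consume an individual with height and weight, return it annotated
# with a BMI value (field names are invented for illustration).
def annotate_bmi(individual):
    """Return a copy of the record with BMI = weight_kg / height_m ** 2."""
    bmi = individual["weight_kg"] / individual["height_m"] ** 2
    return {**individual, "bmi": round(bmi, 1)}

out = annotate_bmi({"height_m": 1.8, "weight_kg": 81.0})
```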
Luke McCarthy's tutorial - originally created for the CBRASS Project, funded by CANARIE.
The slideshow takes you though the design of a SADI Service, the considerations when creating service input and output classes (where DL reasoning is used for matchmaking), and how SADI fits with other initiatives such as SAWSDL
Presentation to the J. Craig Venter Institute, Dec. 2014 – Mark Wilkinson
This is largely a compilation of various other talks that I have posted here - a summary of the past 3+ years of work on SADI/SHARE. It includes the (now well-worn!!) slides about SHARE, as well as some of the more contemporary stuff about how we extended GALEN clinical classes with richer semantic descriptions, and then used them to do automated clinical phenotype analysis. Also includes the slide-deck related to automated Measurement Unit conversion (related to our work on semantically representing Framingham clinical risk assessment rules)
So... for anyone who regularly follows my uploads, there isn't much "new" in here, but at least it's all in one place now! :-)
Enhancing Reproducibility and Transparency in Clinical Research through Seman... – Mark Wilkinson
We were interested in whether we could model well-established clinical risk guidelines in OWL, and use these to automatically classify patient data with respect to "risk" (e.g. using the Framingham risk categories). What we ended up doing, however, was wandering down a very interesting path of attempting to model clinical intuition! This reports the first phase of the experiment; a subsequent SlideShare will give part II of this investigation.
This is the work of Soroush Samadian, Ph.D. Candidate at the University of British Columbia Bioinformatics Graduate Programme.
The same story as usual, but with a bit more context (why it is absolutely necessary to move science in this direction). Presented to the University of Potsdam, Germany, and the University of New Brunswick, Canada, in December 2012.
This is a brief version of earlier talks, but I think it might explain more emphatically what I think Web Science is, and why I believe it is realistic, and how SADI/SHARE technologies (or technologies like them) are important to achieve the vision
Science in the Web, from hypothesis to result. Publishing in silico experiments IN the Web allows us to immediately and precisely disseminate new knowledge that can affect other Web Science experiments. This is the "singularity" where a new discovery is immediately put into practice
Evaluating Hypotheses using SPARQL-DL as an abstract workflow language to cho... – Mark Wilkinson
Some of the recent work we've been doing with SADI and SHARE, using SHARE as a mechanism for dynamically converting OWL Classes into workflows in a data-dependent manner; OWL, in this case, is acting as an abstract workflow model. The slides in the middle are the usual SADI/SHARE explanation; the slides at the end show how we're using these dynamically generated workflows to "personalize" medical information on the Web for a particular patient's profile.
SWAT4LS 2011: SADI Knowledge Explorer Plug-in – Mark Wilkinson
my presentation of the SADI plug-in to the IO Informatics' Knowledge Explorer. Presented at SWAT4LS (Semantic Web Applications and Tools for Life Sciences), London, UK, December, 2011. It describes how we resolve identifiers to semantic metadata in a variety of ways in order to boot-strap the semantics required to do service discovery and matching. It also describes how we convert OWL classes into approximately matching SPARQL queries, and store these queries in the SADI registry such that, after service discovery, it is simple to extract the data a service requires as its input.
SADI in Perl - Protege Plugin Tutorial (fixed Aug 24, 2011) – Mark Wilkinson
IMPORTANT CORRECTION TO THIS SLIDESHOW WAS MADE August 24, 2011. How to use the Protege SADI plugin to generate SADI-compliant semantic web services. Created for the 2011 DBCLS BioHackathon. Credits to Mark Wilkinson, Benjamin Vandervalk, Luke McCarthy, Edward Kawas.
Multi-cluster Kubernetes Networking- Patterns, Projects and GuidelinesSanjeev Rampal
Talk presented at Kubernetes Community Day, New York, May 2024.
Technical summary of Multi-Cluster Kubernetes Networking architectures with focus on 4 key topics.
1) Key patterns for Multi-cluster architectures
2) Architectural comparison of several OSS/ CNCF projects to address these patterns
3) Evolution trends for the APIs of these projects
4) Some design recommendations & guidelines for adopting/ deploying these solutions.
1. FAIRport Skunkworks
EU Lead: Mark Wilkinson, Fundacion BBVA Chair in Biological Informatics; Isaac Peral Distinguished Researcher, CBGP-UPM
USA Lead: Michel Dumontier, Associate Professor, Biomedical Informatics, Stanford
FAIRport Project Lead: Barend Mons, Professor, Leiden University Medical Centre
3. What is a FAIRport?
● Findable - (meta)data should be uniquely and persistently identifiable
● Accessible - identifiers should provide a mechanism for (meta)data access, including authentication, access protocol, license, etc.
● Interoperable - (meta)data should be machine-accessible, using a machine-parseable syntax and, where possible, shared common vocabularies
● Reusable - there should be sufficient machine-readable metadata that it is possible to “integrate like-with-like”, and that component data objects can be precisely and comprehensively cited post-integration
4. “Skunkworks”
“...a group within an organization given a high
degree of autonomy and unhampered by
bureaucracy, tasked with working on advanced
or secret projects.” -- Wikipedia: http://en.wikipedia.org/wiki/Skunk_Works
5. “Skunkworks” FAIRport group
Objective (ongoing): explore existing technologies and attempt to build prototype FAIRport code components using, whenever possible, existing standards. Once desirable FAIR behaviors have been achieved, hand off to a professional coding team to ensure production-quality outcomes.
● Self-selected “hackers”
● Self-identified tasks (next few slides)
● This led to a series of Web meetings, and a joint Hackathon, with participants at venues in the Netherlands and the USA.
6. Typical Problem
I’m looking for microarray data of human liver cells on a time-course following liver transplant.
What repositories *could* contain this data?
● GEO? EUDat? NPG Scientific Data?
● What fields in those repositories would I need to search, using what vocabularies, to find what I need?
7. “Skunkworks” - initial observations
There are a lot of repositories out there!
General Purpose: Dryad, EUDat, Figshare, DataVerse, etc.
Special Purpose: PDB, UniProt, NCBI, EnsEMBL
Lack of rich, machine-readable descriptions of the contents of these repositories hinders us from (for example):
● knowing where we can look for certain types of data
● knowing if two repositories contain records about the same thing
● cross-referencing or “joining” across repositories to integrate disparate data about the same thing
● knowing which repository I could/should deposit my data to (and how)
8. Challenge
If we wanted to enable this kind of FAIR discovery and integration over myriad repositories, what infrastructure (existing/new) would we need?
9. Task: harmonized cross-repository meta-descriptors
Though self-selected as a FAIRport Skunkworks task, this significantly overlaps with the Force11 Data Citation Implementation Working Group Team 4, “Common repository interfaces”.
...so we joined forces :-)
10. Task: harmonized cross-repository meta-descriptors
Exemplar use-cases:
● A piece of software that can generate a “sensible” query form/interface for any repository
● A piece of software that can generate a “sensible” and comprehensive data submission form for any repository
11. Prior Art? The DCAT Data Catalog Vocabulary
“DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web…. By using DCAT to describe datasets in data catalogs, publishers increase discoverability and enable applications easily to consume metadata from multiple catalogs. It further enables decentralized publishing of catalogs and facilitates federated dataset search across sites. Aggregated DCAT metadata can serve as a manifest file to facilitate digital preservation.”
http://www.w3.org/TR/vocab-dcat/
W3C Recommendation 16 January 2014
12. Prior Art? The DCAT Data Catalog Vocabulary
DCAT is an RDF Schema that defines core metadata elements describing dataset collections and the datasets within those collections. e.g.
:dataset-001
    a dcat:Dataset ;
    dct:title "Imaginary dataset" ;
    dcat:keyword "accountability", "transparency", "payments" ;
    dct:issued "2011-12-05"^^xsd:date ;
    dct:modified "2011-12-05"^^xsd:date ;
    dct:temporal <http://reference.data.gov.uk/id/quarter/2006-Q1> ;
    dct:spatial <http://www.geonames.org/6695072> ;
    dct:publisher :finance-ministry ;
    dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ;
    dcat:distribution :dataset-001-csv .
13. So the core metadata of a repository’s collections could be described in DCAT...
14. ...if the repositories used DCAT…
15. ...generally speaking, they don’t...
16. ...and we need more than just core metadata to enable cross-repository search anyway…
17. So DCAT itself isn’t the solution to our problem because, among other things, it does not provide sufficiently rich descriptors
20. What exactly *is* our problem?
The Data Schema (e.g. XMLS, RDFS) defines the Data Record (e.g. XML, RDF)
21. The Metadata Record (e.g. DCAT-compliant RDF) describes the Data Record
22. The DCAT RDFS Schema defines the Metadata Record
23. If everyone was using all elements of the DCAT schema to define their core metadata, then (that part of) the problem would be solved at this point
24. We could use THIS...
25. ...to build queries about THIS
(each “THIS” pointing at a layer of the diagram)
26. What exactly *is* our problem? REALITY:
● an XML Data Record, defined by an XMLS Data Schema, described by a DCAT RDF Metadata Record, defined by the DCAT RDFS Schema
● an RDF Data Record, defined by an RDFS Data Schema, described by a UniProt RDF Metadata Record, defined by the UniProt RDFS Metadata Schema
● an ACEDB Data Record, defined by an ACEDB Data Schema, described by a DragonDB Form Metadata Record, defined by the DragonDB Form Metadata Schema
(Slides 27-32 repeat the diagram above, adding one observation each:)
27. Repositories don’t all use the DCAT Schema
28. Those that use the DCAT Schema use only parts of it
29. Those that don’t use DCAT use a myriad of alternatives (some very loosely defined)
30. And don’t necessarily use all elements of those alternatives either
31. So how are we going to do RICH queries over all of these?
32. We need a way to describe the descriptors...
33. The DCAT WG suggested the same thing
They said there was a need for “DCAT Profiles”:
“A DCAT Profile is a specification for data catalogs that adds additional constraints to DCAT. Additional constraints in a profile MAY include:
● A minimum set of required metadata fields
● Classes and properties for additional metadata fields not covered in DCAT
● Controlled vocabularies or URI sets as acceptable values for properties
● Requirements for specific access mechanisms (RDF syntaxes, protocols) to the catalog’s RDF description”
http://www.w3.org/TR/vocab-dcat/
34. In other words, a DCAT Profile is a generic way to describe what metadata fields a repository has, and what the constraints on those fields are
35. But the DCAT WG also suggested... DCAT Profiles don’t exist!
36. “FAIR Profiles”
At the Hackathon, the “Skunkers” decided to invent the DCAT Profile technology. Since they are intended to allow descriptions of:
● descriptor metadata fields not included in DCAT...
● ...in many cases, descriptors with ZERO metadata fields from DCAT...
● ...and in many cases, descriptors that are not even in RDF...
we call them “FAIR Profiles” rather than DCAT Profiles.
(However, clear acknowledgements to the DCAT Working Group for conceiving of the idea!)
37. What the FAIR Profile technology accomplishes
(the diagram from slide 26 again, now with a FAIR Profile attached to each metadata schema)
38. ● FAIR Profile of the DCAT Schema
● FAIR Profile of the UniProt Metadata Schema
● FAIR Profile of the DragonDB Metadata Schema
39. Though they are potentially describing very different things (from Web FORM fields to OWL Ontologies!), all FAIR Profiles are written using the same vocabulary and structure, defined by...
40. ...the FAIR Profile Schema (described on the following slides)
42. The full stack:
● The Repo. Data Schema (e.g. XMLS, RDFS) defines the Repo. Data Record (e.g. XML, RDF)
● The Repository Metadata Record describes the Repo. Data Record
● The Repository Metadata Schema defines the Repository Metadata Record
● The Repository’s FAIR Profile describes the Repository Metadata Schema
● The FAIR Profile Schema defines the Repository’s FAIR Profile
43. “All problems in computer science can be solved by another level of indirection” -- David Wheeler, inventor of the subroutine
44. “...But that usually will create another problem.” -- David Wheeler
Diomidis Spinellis. Another level of indirection. In Andy Oram and Greg Wilson, editors, Beautiful Code: Leading Programmers Explain How They Think, chapter 17, pages 279–291. O’Reilly and Associates, Sebastopol, CA, 2007.
45. Desiderata for the FAIR Profile Schema
● Must describe legacy data (i.e. not just DCAT or other “modern” data)
● Must describe a multitude of data formats (XML, RDF, Key/Value, etc.)
● Must be capable of describing OWL-DL-governed data (still rare, but increasingly used… Classes, property-restrictions, etc.)
● Must be capable of describing any kind of value constraint, e.g. an arbitrary CV, rdfs:range, or an equivalent OWL construct
● Must be hierarchical (i.e. the value-constraint of a field can be set as an entirely separate FAIR Profile)
● Must be modular, identifiable, shareable, and reusable (to stem the proliferation of new formats)
● Must use standard technologies, and re-use existing vocabularies where possible
● Must be extremely lightweight
● Must NOT require the participation of the repository host (no buy-in required)
46. FAIR Profile Schema
A very lightweight meta-meta-descriptor, written in the RDFS language:
● FAIR Profile --hasClass--> FP Class
● FP Class --hasProperty--> FP Property
● FP Property --allowedValues--> Property Restriction Definition
● FP Class --classType--> External Ontology or RDFS Class (optional)
● FP Property --propertyType--> External Ontology or RDFS Predicate (optional)
http://github.com/DataFairPort/DataFairPort/blob/Master/Schema/DCATProfile.rdfs
47. (same schema, annotated) Open questions on the Property Restriction Definition: Requirement Status? Cardinality? Other Constraint?
49. Property Restriction Definition (XSD, FAIR Profile, SKOS)
Describes the constraints on the possible values for a predicate in the target repository’s metadata Schema.
NOTE: we cannot use rdfs:range because we are meta-modelling! The predicate is a CLASS at the meta-model level, so use of rdfs:range is not appropriate.
50. The possible values are:
● An XSD Datatype
● Another FAIR Profile (i.e. hierarchical profiles)
● A SKOS View on a set of ontology terms from one or more ontologies
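To make the shape concrete, here is a minimal, hypothetical sketch of one FP Class with one FP Property, written as plain Python data rather than RDF. All key names are illustrative stand-ins for the schema's actual RDFS terms; the DCAT and Dublin Core URIs are real, and the concept-scheme URI comes from the slides.

```python
# Hypothetical sketch of a FAIR Profile as plain data:
# Profile -> FP Class -> FP Property -> Property Restriction Definition.
# Key names are illustrative, not the real RDFS terms.
profile = {
    "label": "DemoMicroarrayProfile",
    "classes": [
        {
            "label": "CoreMicroarrayDistributionMetadata",
            # classType: optional pointer to an external ontology/RDFS class
            "class_type": "http://www.w3.org/ns/dcat#Distribution",
            "properties": [
                {
                    "label": "data format",
                    # propertyType: optional pointer to an external predicate
                    "property_type": "http://purl.org/dc/terms/format",
                    # allowedValues: a Property Restriction Definition; here a
                    # SKOS view, but it could equally be an XSD datatype or
                    # another (hierarchically embedded) FAIR Profile
                    "allowed_values": {
                        "kind": "skos_scheme",
                        "uri": "http://biordf.org/DataFairPort/ConceptSchemes/"
                               "EDAM_Microarray_Data_Format",
                    },
                },
            ],
        },
    ],
}
```

Whatever the target repository's native metadata format, a consumer only ever has to understand this one shape.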
51. A FAIR Profile
(an RDF document that follows the FAIR Profile Schema)
In the stack: the Metadata Record (e.g. DCAT-compliant RDF) is defined by the DCAT RDFS Schema, which is described by the FAIR Profile (“This!”), which in turn is defined by the FAIR Profile Schema.
52. A FAIR Profile
(Slides 52-69 walk through an example profile; each slide shows the schema diagram from slide 46 with a different element highlighted.)
53. FAIR Profiles are FAIR! (Identifiable, Re-usable, and Shareable)
56. The CoreMicroarrayDistributionMetadata Descriptor Class
58. The Class follows the “DCAT Distribution” Class model
59. It uses only 3 properties from the “DCAT Distribution” Class model
60. Property #1: let’s look at one of them in detail
61. This Meta-Descriptor element is a ‘FAIR Profile Property’ Class
62. This is its label within that organization’s metadata descriptor
63. This is the URL of the Predicate used by that descriptor
64. This is the “range” of that Predicate within the organization’s descriptor
65. Property #2: let’s look at a different property from the CoreMicroarrayDistributionMetadata Class
67. This is the label for that property
68. The URL of the predicate of this Property
69. In the Metadata Descriptor, this property is constrained by the set of ontology terms defined in the SKOS Concept Scheme EDAM_Microarray_Data_Format
70. <rdf:Description xmlns:ns1="http://www.w3.org/2002/07/owl#"
rdf:about="http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Ontology"/>
<rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/>
<ns1:imports rdf:resource="http://purl.bioontology.org/ontology/EDAM"/>
</rdf:Description>
<rdf:Description
xmlns:ns1="http://www.w3.org/2000/01/rdf-schema#"
xmlns:ns2="http://www.w3.org/2004/02/skos/core#"
rdf:about="http://edamontology.org/format_1641">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
<ns1:label>affymetrix-exp</ns1:label>
<ns2:broader rdf:resource="http://edamontology.org/format_2056"/>
<ns2:inScheme rdf:resource="http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"/>
</rdf:Description>
<rdf:Description
xmlns:ns1="http://www.w3.org/2000/01/rdf-schema#"
xmlns:ns2="http://www.w3.org/2004/02/skos/core#"
rdf:about="http://edamontology.org/format_2056">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/>
<rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
<ns1:label>Microarray experiment data format</ns1:label>
<ns2:broader rdf:resource="http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"/>
<ns2:inScheme rdf:resource="http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"/>
</rdf:Description>
http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format
This is a “SKOSified” view of the EDAM Ontology
Jupp, et al., “Taking a view on bio-ontologies” ceur-ws.org/Vol-897/session4-paper22.pdf
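Because the concept scheme is plain RDF/XML, even a standard-library XML parser can recover the machine-readable concepts. A rough sketch (a real consumer would use a proper RDF library; the document below abbreviates the scheme above):

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RDFS = "{http://www.w3.org/2000/01/rdf-schema#}"
SKOS_CONCEPT = "http://www.w3.org/2004/02/skos/core#Concept"

# Abbreviated version of the EDAM_Microarray_Data_Format scheme above.
DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://edamontology.org/format_1641">
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
    <rdfs:label>affymetrix-exp</rdfs:label>
  </rdf:Description>
  <rdf:Description rdf:about="http://edamontology.org/format_2056">
    <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
    <rdfs:label>Microarray experiment data format</rdfs:label>
  </rdf:Description>
</rdf:RDF>"""

def concept_labels(rdfxml):
    """Collect the rdfs:label of every skos:Concept in the document."""
    root = ET.fromstring(rdfxml)
    labels = []
    for desc in root.iter(RDF + "Description"):
        types = {t.get(RDF + "resource") for t in desc.findall(RDF + "type")}
        if SKOS_CONCEPT in types:
            label = desc.findtext(RDFS + "label")
            if label is not None:
                labels.append(label)
    return labels

labels = concept_labels(DOC)
# labels -> ["affymetrix-exp", "Microarray experiment data format"]
```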
71. A FAIR Profile: return to the very top of our FAIR Profile and follow the ExtendedAuthorship Class (schema diagram)
72. ExtendedAuthorship: follow one of the properties of the ExtendedAuthorship Class
74. Author ORCID: the allowed values of this Property are constrained to be individuals that follow the FAIR Profile Schema “DemoORCIDProfileScheme”
76. http://biordf.org/DataFairPort/ProfileSchemas/DemoORCIDProfileScheme.rdf
This is parsed in exactly the same way as our original DemoMicroarrayProfileScheme, but is embedded within it as the value of the author_ORCID property.
…Arbitrary, hierarchical layers of complexity…
77. So to build an interface (e.g. query or data-capture) from a FAIR Profile:
[1] Parse all FAIR Profile classes
    Parse the properties of each class
        Determine the target predicate
        Determine the target value-restrictions
            Call [1] if the restriction is a FAIR Profile
        Create a metadata [capture/query] facet with that predicate and that restriction
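That recursion is only a few lines of code. A hypothetical sketch over a plain-data stand-in for a parsed profile (the data shape, key names, and example.org URIs are all illustrative):

```python
# Hypothetical sketch of slide 77's recursion: walk a FAIR Profile and emit
# one (facet path, predicate, restriction kind) tuple per property, recursing
# whenever a property's restriction is itself an embedded FAIR Profile.

def build_facets(profile, prefix=""):
    facets = []
    for cls in profile["classes"]:                    # [1] parse all classes
        for prop in cls["properties"]:                # parse each property
            predicate = prop["property_type"]         # target predicate
            restriction = prop["allowed_values"]      # target value-restriction
            path = prefix + prop["label"]
            if restriction["kind"] == "fair_profile": # call [1] recursively
                facets += build_facets(restriction["profile"], path + ".")
            else:                                     # create a capture/query facet
                facets.append((path, predicate, restriction["kind"]))
    return facets

# A toy profile with an embedded ORCID sub-profile, echoing slides 71-76:
orcid_profile = {"classes": [{"label": "ORCID", "properties": [
    {"label": "orcid_id",
     "property_type": "http://example.org/orcid_id",  # illustrative URI
     "allowed_values": {"kind": "xsd_datatype"}}]}]}

demo_profile = {"classes": [{"label": "ExtendedAuthorship", "properties": [
    {"label": "author_ORCID",
     "property_type": "http://example.org/author",    # illustrative URI
     "allowed_values": {"kind": "fair_profile", "profile": orcid_profile}}]}]}

facets = build_facets(demo_profile)
# facets -> [("author_ORCID.orcid_id", "http://example.org/orcid_id", "xsd_datatype")]
```

The same walk serves both use-cases from slide 10: each emitted facet becomes either a query field or a data-submission field.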
78.
79. Annotations on an example profile:
● FAIR Profile Classes #1, #2, #3, and Class #4 (embedded)
● Value constraints
● Descriptor-specific labels associated with ontology predicates (if applicable)
● “Classes” may be associated with an ontology to allow reasoning, or may just represent an “arbitrary” grouping of properties within the target metadata descriptor
● Metadata Descriptor-specific details are captured, e.g. this field is required by this target Metadata Descriptor
80. Other features of FAIR Profiles
● Do not require repository participation
● Provide a purpose-driven, potentially non-comprehensive “view” on a repository, of which there may be many, according to what the profile author needs to cross-query
● Profiles of any given repository facet are not required to be identical! e.g. a different profile might utilize a different controlled vocabulary over any given facet (e.g. a freetext facet)
● Anybody can define a profile (of course, the profile defined by the repository owner should be considered “canonical”... the rest are just purpose-built “best-guesses”)
● FAIR Profiles can/should be indexed and shared, to facilitate cross-repository interoperability and integration
● There is no (obvious) reason why a FAIR Profile could not be used to describe the DATA in the repository, not just the metadata...
81. Nothin’ ain’t worth nothin’, but it’s free! -- Kris Kristofferson
“All problems in computer science can be solved by another level of indirection ...But that usually will create another problem." -- David Wheeler
82. The FAIR Profile isn’t “a magic bean”!
It DOES NOT ACCOMPLISH SEMANTIC MAPPING between one field in one repository and a semantically-related field in another repository.
83. It does give us a standard way to identify, describe, and meta-link these fields, and a predictable place where a mapping mechanism could be injected.
84. ...we don’t inject it (yet!) because that would require invention of yet another “standard”, and we want to avoid that if possible!
85. There may be some in the audience who, like me, recognize that this problem is nearly identical to the problem faced by the WSDL -> SAWSDL community. I will be looking at their solution for guidance in the next phase of FAIR Profiles...
… so we still have problems, but at least they are now re-defined as problems for which there are solutions!
86. Skunkworks Participants
● Mark Wilkinson
● Michel Dumontier
● Barend Mons
● Tim Clark
● Jun Zhao
● Paolo Ciccarese
● Paul Groth
● Erik van Mulligen
● Luiz Olavo Bonino da Silva Santos
● Matthew Gamble
● Carole Goble
● Joël Kuiper
● Morris Swertz
● Erik Schultes
● Mercè Crosas
● Adrian Garcia
● Philip Durbin
● Jeffrey Grethe
● Katy Wolstencroft
● Sudeshna Das
● M. Emily Merrill
87. Post-presentation comments
We should look at ISO 11179: are we duplicating those efforts, or are we creating something that is an implementation of those efforts?
See also Dublin Core’s similar initiative.