Force11 JDDCP workshop presentation, @ Force2015, Oxford

Presentation of the early prototype of "FAIR Profiles", an implementation of the "DCAT Profile" concept proposed by the DCAT working group (but, as far as I know, never implemented). This prototype emerged from the work of the "Skunkworks" group of the Data FAIRport project.


  1. 1. EU Lead Mark Wilkinson Fundacion BBVA Chair in Biological Informatics Isaac Peral Distinguished Researcher, CBGP-UPM USA Lead Michel Dumontier Associate Professor, Biomedical Informatics, Stanford FAIRport Project Lead Barend Mons Professor, Leiden University Medical Centre FAIRport Skunkworks
  2. 2. “Skunkworks” Team Update Objectives and Outcomes (...so far...)
  3. 3. What is a FAIRport? ● Findable - (meta)data should be uniquely and persistently identifiable ● Accessible - identifiers should provide a mechanism for (meta)data access, including authentication, access protocol, license, etc. ● Interoperable - (meta)data should be machine-accessible, using a machine-parseable syntax and, where possible, shared common vocabularies. ● Reusable - there should be sufficient machine-readable metadata that it is possible to “integrate like-with-like”, and that component data objects can be precisely and comprehensively cited post-integration.
  4. 4. “Skunkworks” “...a group within an organization given a high degree of autonomy and unhampered by bureaucracy, tasked with working on advanced or secret projects.” -- Wikipedia: http://en.wikipedia.org/wiki/Skunk_Works
  5. 5. “Skunkworks” FAIRport group Objective (ongoing) - explore existing technologies and attempt to build prototype FAIRport code components using, whenever possible, existing standards. Once desirable FAIR behaviors have been achieved, hand-off to a professional coding team to ensure production-quality outcomes. ● Self-selected “hackers” ● Self-identified tasks (next few slides) ● Led to a series of Web meetings, and a joint Hackathon, with participants at venues in Netherlands and USA.
  6. 6. Typical Problem I’m looking for microarray data of human liver cells on a time-course following liver transplant. What repositories *could* contain this data? ● GEO? EUDat? NPG Scientific Data? ● What fields in those repositories would I need to search, using what vocabularies, to find what I need?
  7. 7. “Skunkworks” - initial observations There are a lot of repositories out there! General Purpose: Dryad, EUDat, Figshare, DataVerse, etc. Special Purpose: PDB, UniProt, NCBI, EnsEMBL Lack of rich, machine-readable descriptions of the contents of these repositories hinders us from (for example): ● knowing where we can look for certain types of data ● knowing if two repositories contain records about the same thing ● Cross-referencing or “joining” across repositories to integrate disparate data about the same thing ● Knowing which repository I could/should deposit my data to (and how)
  8. 8. If we wanted to enable this kind of FAIR discovery and integration over myriad repositories, what infrastructure (existing/new) would we need? Challenge
  9. 9. Task: harmonized cross-repository meta-descriptors Though self-selected as a FAIRport Skunkworks task, this significantly overlaps with the Force11 Data Citation Implementation Working Group Team 4 - “Common repository interfaces”. ...so we joined forces :-)
  10. 10. Exemplar use-cases: A piece of software that can generate a “sensible” query form/interface for any repository A piece of software that can generate a “sensible” and comprehensive data submission form for any repository Task: harmonized cross-repository meta-descriptors
  11. 11. Prior Art? “DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web…. By using DCAT to describe datasets in data catalogs, publishers increase discoverability and enable applications easily to consume metadata from multiple catalogs. It further enables decentralized publishing of catalogs and facilitates federated dataset search across sites. Aggregated DCAT metadata can serve as a manifest file to facilitate digital preservation.” http://www.w3.org/TR/vocab-dcat/ W3C Recommendation 16 January 2014 DCAT Data Catalog Vocabulary
  12. 12. DCAT is an RDF Schema that defines core metadata elements describing dataset collections and the datasets within those collections. e.g. :dataset-001 a dcat:Dataset ; dct:title "Imaginary dataset" ; dcat:keyword "accountability","transparency" ,"payments" ; dct:issued "2011-12-05"^^xsd:date ; dct:modified "2011-12-05"^^xsd:date ; dct:temporal <http://reference.data.gov.uk/id/quarter/2006-Q1> ; dct:spatial <http://www.geonames.org/6695072> ; dct:publisher :finance-ministry ; dct:language <http://id.loc.gov/vocabulary/iso639-1/en> ; dcat:distribution :dataset-001-csv ; Prior Art? DCAT Data Catalog Vocabulary
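To make the DCAT example on this slide a little more concrete, here is a minimal sketch that holds a few of its statements as plain Python tuples and expands the prefixed names into full IRIs. The `dcat`, `dct`, and `rdf` namespace IRIs are the standard ones; the empty prefix, the `expand` and `properties_of` helpers, and the tuple layout are assumptions made only for illustration. A real application would normally use an RDF library instead.

```python
# Namespace IRIs for rdf, dcat and dct are the standard ones.
# The empty prefix is a made-up base IRI used only in this sketch.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "": "http://example.org/catalog/",
}


def expand(curie):
    """Expand a prefixed name such as 'dct:title' into a full IRI."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


# A subset of the statements on the slide, as (subject, predicate, object).
# Plain strings in the object position stand in for RDF literals.
TRIPLES = [
    (":dataset-001", "rdf:type", "dcat:Dataset"),
    (":dataset-001", "dct:title", "Imaginary dataset"),
    (":dataset-001", "dcat:keyword", "accountability"),
    (":dataset-001", "dcat:keyword", "transparency"),
    (":dataset-001", "dcat:keyword", "payments"),
    (":dataset-001", "dcat:distribution", ":dataset-001-csv"),
]


def properties_of(subject):
    """Group the objects of one subject by expanded predicate IRI."""
    result = {}
    for s, p, o in TRIPLES:
        if s == subject:
            result.setdefault(expand(p), []).append(o)
    return result
```

Calling `properties_of(":dataset-001")` returns a dictionary keyed by full predicate IRI, which is roughly the "core metadata" view that a DCAT-aware client would work with.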
  13. 13. So the core metadata of a repository’s collections could be described in DCAT...
  14. 14. So the core metadata of a repository’s collections could be described in DCAT... ...if the repositories used DCAT…
  15. 15. So the core metadata of a repository’s collections could be described in DCAT... ...if the repositories used DCAT… ...generally speaking, they don’t...
  16. 16. So the core metadata of a repository’s collections could be described in DCAT... ...if the repositories used DCAT… ...generally speaking, they don’t... ...and we need more than just core metadata to enable cross-repository search anyway…
  17. 17. So DCAT itself isn’t the solution to our problem because, among other things, it does not provide sufficiently rich descriptors
  18. 18. What exactly *is* our problem?
  19. 19. What exactly *is* our problem? Data Record (e.g. XML, RDF)
  20. 20. What exactly *is* our problem? Data Record (e.g. XML, RDF) Data Schema (e.g. XMLS, RDFS) Defines
  21. 21. What exactly *is* our problem? Data Record (e.g. XML, RDF) Data Schema (e.g. XMLS, RDFS) Metadata Record (e.g. DCAT-compliant RDF) Defines Describes
  22. 22. What exactly *is* our problem? Data Record (e.g. XML, RDF) Data Schema (e.g. XMLS, RDFS) Metadata Record (e.g. DCAT-compliant RDF) DCAT RDFS Schema Defines Describes Defines
  23. 23. What exactly *is* our problem? Data Record (e.g. XML, RDF) Data Schema (e.g. XMLS, RDFS) Metadata Record (e.g. DCAT-compliant RDF) DCAT RDFS Schema If everyone was using all elements of the DCAT schema to define their core metadata then (that part of) the problem would be solved at this point
  24. 24. What exactly *is* our problem? Data Record (e.g. XML, RDF) Data Schema (e.g. XMLS, RDFS) Metadata Record (e.g. DCAT-compliant RDF) DCAT RDFS Schema If everyone was using all elements of the DCAT schema to define their core metadata then (that part of) the problem would be solved at this point We could use THIS
  25. 25. What exactly *is* our problem? Data Record (e.g. XML, RDF) Data Schema (e.g. XMLS, RDFS) Metadata Record (e.g. DCAT-compliant RDF) DCAT RDFS Schema If everyone was using all elements of the DCAT schema to define their core metadata then (that part of) the problem would be solved at this point To build queries about THIS
  26. 26. What exactly *is* our problem? XML Data Record XMLS Data Schema DCAT RDF Metadata Record RDF Data Record RDFS Data Schema UniProt RDF Metadata Record ACEDB Data Record ACEDB Data Schema DragonDB Form Metadata Record DCAT RDFS Schema UniProt RDFS MetadataSchema DragonDB Form Metadata Schema REALITY
  27. 27. What exactly *is* our problem? XML Data Record XMLS Data Schema DCAT RDF Metadata Record RDF Data Record RDFS Data Schema UniProt RDF Metadata Record ACEDB Data Record ACEDB Data Schema DragonDB Form Metadata Record DCAT RDFS Schema UniProt RDFS MetadataSchema DragonDB Form Metadata Schema Repositories don’t all use DCAT Schema
  28. 28. What exactly *is* our problem? XML Data Record XMLS Data Schema DCAT RDF Metadata Record RDF Data Record RDFS Data Schema UniProt RDF Metadata Record ACEDB Data Record ACEDB Data Schema DragonDB Form Metadata Record DCAT RDFS Schema UniProt RDFS MetadataSchema DragonDB Form Metadata Schema Those that use DCAT Schema, use only parts of it
  29. 29. What exactly *is* our problem? XML Data Record XMLS Data Schema DCAT RDF Metadata Record RDF Data Record RDFS Data Schema UniProt RDF Metadata Record ACEDB Data Record ACEDB Data Schema DragonDB Form Metadata Record DCAT RDFS Schema UniProt RDFS MetadataSchema DragonDB Form Metadata Schema Those that don’t use DCAT use a myriad of alternatives (some very loosely defined)
  30. 30. What exactly *is* our problem? XML Data Record XMLS Data Schema DCAT RDF Metadata Record RDF Data Record RDFS Data Schema UniProt RDF Metadata Record ACEDB Data Record ACEDB Data Schema DragonDB Form Metadata Record DCAT RDFS Schema UniProt RDFS MetadataSchema DragonDB Form Metadata Schema And don’t necessarily use all elements of those alternatives either
  31. 31. What exactly *is* our problem? XML Data Record XMLS Data Schema DCAT RDF Metadata Record RDF Data Record RDFS Data Schema UniProt RDF Metadata Record ACEDB Data Record ACEDB Data Schema DragonDB Form Metadata Record DCAT RDFS Schema UniProt RDFS MetadataSchema DragonDB Form Metadata Schema So how are we going to do RICH queries over all of these?
  32. 32. What exactly *is* our problem? XML Data Record XMLS Data Schema DCAT RDF Metadata Record RDF Data Record RDFS Data Schema UniProt RDF Metadata Record ACEDB Data Record ACEDB Data Schema DragonDB Form Metadata Record DCAT RDFS Schema UniProt RDFS MetadataSchema DragonDB Form Metadata Schema We need a way to describe the descriptors...
  33. 33. The DCAT WG suggested the same thing They said there was a need for “DCAT Profiles” A DCAT Profile is a specification for data catalogs that adds additional constraints to DCAT. Additional constraints in a profile MAY include: ● A minimum set of required metadata fields ● Classes and properties for additional metadata fields not covered in DCAT ● Controlled vocabularies or URI sets as acceptable values for properties ● Requirements for specific access mechanisms (RDF syntaxes, protocols) to the catalog's RDF description http://www.w3.org/TR/vocab-dcat/
  34. 34. The DCAT WG suggested the same thing They said there was a need for “DCAT Profiles” A DCAT Profile is a specification for data catalogs that adds additional constraints to DCAT. Additional constraints in a profile MAY include: ● A minimum set of required metadata fields ● Classes and properties for additional metadata fields not covered in DCAT ● Controlled vocabularies or URI sets as acceptable values for properties ● Requirements for specific access mechanisms (RDF syntaxes, protocols) to the catalog's RDF description http://www.w3.org/TR/vocab-dcat/ A DCAT Profile is: A generic way to describe what metadata fields a repository has and what the constraints on those fields are
  35. 35. But the DCAT WG also suggested... A DCAT Profile is a specification for data catalogs that adds additional constraints to DCAT. Additional constraints in a profile MAY include: ● A minimum set of required metadata fields ● Classes and properties for additional metadata fields not covered in DCAT ● Controlled vocabularies or URI sets as acceptable values for properties ● Requirements for specific access mechanisms (RDF syntaxes, protocols) to the catalog's RDF description DCAT Profiles don’t exist! http://www.w3.org/TR/vocab-dcat/
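Two of the constraint types listed for a DCAT Profile (a minimum set of required fields, and controlled vocabularies as acceptable values) can be sketched as a small checker. This is only an illustration of the idea: the `ProfileConstraints` class, the `check_record` function, and the field names in the usage note are invented here and do not come from the DCAT specification or from the FAIRport code.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileConstraints:
    """Hypothetical container for two kinds of profile constraint."""
    required: set = field(default_factory=set)          # field names that must be present
    allowed_values: dict = field(default_factory=dict)  # field name -> set of permitted values


def check_record(record, profile):
    """Return a list of human-readable problems; an empty list means the record passes."""
    problems = []
    # Required fields that the record does not provide.
    for name in sorted(profile.required - set(record)):
        problems.append(f"missing required field: {name}")
    # Fields whose value falls outside the controlled vocabulary.
    for name, allowed in sorted(profile.allowed_values.items()):
        if name in record and record[name] not in allowed:
            problems.append(f"value not allowed for {name}: {record[name]}")
    return problems
```

For example, with `required={"title", "license"}` and `allowed_values={"format": {"cel", "exp"}}`, a record `{"title": "x", "format": "csv"}` would be reported as missing `license` and as having a disallowed `format`.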
  36. 36. “FAIR Profiles” At the Hackathon, the “Skunkers” decided to invent the DCAT Profile technology. Since they are intended to allow descriptions of ● Descriptor metadata fields not included in DCAT... ● ...in many cases, Descriptors with ZERO metadata fields from DCAT... ● ...and in many cases, Descriptors that are not even in RDF... We call them “FAIR Profiles” rather than DCAT profiles (However, clear acknowledgements to the DCAT Working Group for conceiving of the idea!)
  37. 37. XML Data Record XMLS Data Schema DCAT RDF Metadata Record RDF Data Record RDFS Data Schema UniProt RDF Metadata Record ACEDB Data Record ACEDB Data Schema DragonDB Form Metadata Record DCAT RDFS Schema UniProt RDFS MetadataSchema DragonDB Form Metadata Schema What the FAIR profile technology accomplishes
  38. 38. XML Data Record XMLS Data Schema DCAT RDF Metadata Record RDF Data Record RDFS Data Schema UniProt RDF Metadata Record ACEDB Data Record ACEDB Data Schema DragonDB Form Metadata Record DCAT RDFS Schema UniProt RDFS MetadataSchema DragonDB Form Metadata Schema FAIR Profile DCAT Schema FAIR Profile UniProt Metadata Schema FAIR Profile DragonDB Metadata Schema What the FAIR profile technology accomplishes
  39. 39. XML Data Record XMLS Data Schema DCAT RDF Metadata Record RDF Data Record RDFS Data Schema UniProt RDF Metadata Record ACEDB Data Record ACEDB Data Schema DragonDB Form Metadata Record DCAT RDFS Schema UniProt RDFS MetadataSchema DragonDB Form Metadata Schema FAIR Profile DCAT Schema FAIR Profile UniProt Metadata Schema FAIR Profile DragonDB Metadata Schema Though they are potentially describing very different things (from Web FORM fields to OWL Ontologies!) all FAIR Profiles are written using the same vocabulary and structure, defined by...
  40. 40. XML Data Record XMLS Data Schema DCAT RDF Metadata Record RDF Data Record RDFS Data Schema UniProt RDF Metadata Record ACEDB Data Record ACEDB Data Schema DragonDB Form Metadata Record DCAT RDFS Schema UniProt RDFS MetadataSchema DragonDB Form Metadata Schema FAIR Profile of DCAT Schema FAIR Profile of UniProt Metadata Schema FAIR Profile of DragonDB Metadata Schema
  41. 41. The FAIR Profile Schema (the thing the Skunkworks team invented)
  42. 42. Repo. Data Record (e.g. XML, RDF) Repo. Data Schema (e.g. XMLS, RDFS) Repository Metadata Record Repository Metadata Schema Defines Describes Defines Defines Describes Repository’s Fair Profile Fair Profile Schema
  43. 43. “All problems in computer science can be solved by another level of indirection” -- David Wheeler inventor of the subroutine
  44. 44. "...But that usually will create another problem." -- David Wheeler “All problems in computer science can be solved by another level of indirection” -- David Wheeler inventor of the subroutine Diomidis Spinellis. Another level of indirection. In Andy Oram and Greg Wilson, editors, Beautiful Code: Leading Programmers Explain How They Think, chapter 17, pages 279–291. O'Reilly and Associates, Sebastopol, CA, 2007.
  45. 45. Desiderata for FAIR Profile Schema ● Must describe legacy data (i.e. not just DCAT or other “modern” data) ● Must describe a multitude of data formats (XML, RDF, Key/Value, etc.) ● Must be capable of describing OWL-DL-governed data (still rare, but increasingly used… Classes, property-restrictions, etc.) ● Must be capable of describing any kind of value constraint, e.g. arbitrary CV, rdf:range, or equivalent OWL construct ● Must be hierarchical (i.e. the value-constraint of a field can be set as an entirely separate FAIR Profile) ● Must be modular, identifiable, shareable, and reusable (to stem the proliferation of new formats) ● Must use standard technologies, and re-use existing vocabularies if poss. ● Must be extremely lightweight ● Must NOT require the participation of the repository host (no buy-in required)
  46. 46. FAIR Profile Schema A very lightweight meta-meta-descriptor, in RDFS language FAIR Profile FP Class FP Property Property Restriction Definition hasClass hasProperty allowed Values classType propertyType External Ontology or RDFS Class (optional) External Ontology or RDFS Predicate (optional) http://github.com/DataFairPort/DataFairPort/blob/Master/Schema/DCATProfile.rdfs
  47. 47. FAIR Profile Schema A very lightweight meta-meta-descriptor, in RDFS language FAIR Profile FP Class FP Property Property Restriction Definition hasClass hasProperty allowed Values classType propertyType External Ontology or RDFS Class (optional) External Ontology or RDFS Predicate (optional) Requirement Status? Cardinality? Other Constraint? http://github.com/DataFairPort/DataFairPort/blob/Master/Schema/DCATProfile.rdfs
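The shape of the schema diagram (a FAIR Profile has Classes, a Class has Properties, and each Property points at its allowed values, with optional links to an external class or predicate) can be mirrored in a few Python dataclasses. This is a rough sketch, not the schema itself, which is written in RDFS. The attribute names follow the labels on the slide; the example profile label, the property labels, and the choice of `dct:format` and `dct:title` as predicates are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FPProperty:
    label: str                            # label used by the repository's own descriptor
    property_type: Optional[str] = None   # IRI of an external predicate (optional)
    allowed_values: Optional[str] = None  # IRI of an XSD datatype, nested profile, or SKOS scheme


@dataclass
class FPClass:
    label: str
    class_type: Optional[str] = None      # IRI of an external ontology or RDFS class (optional)
    has_property: list = field(default_factory=list)


@dataclass
class FAIRProfile:
    label: str
    has_class: list = field(default_factory=list)


def predicates(profile):
    """List every external predicate IRI that the profile mentions."""
    return [
        prop.property_type
        for cls in profile.has_class
        for prop in cls.has_property
        if prop.property_type is not None
    ]


# Illustrative instance; only the class name and the SKOS scheme IRI come from the slides.
profile = FAIRProfile(
    label="Demo microarray profile",
    has_class=[
        FPClass(
            label="CoreMicroarrayDistributionMetadata",
            class_type="http://www.w3.org/ns/dcat#Distribution",
            has_property=[
                FPProperty("title", "http://purl.org/dc/terms/title",
                           "http://www.w3.org/2001/XMLSchema#string"),
                FPProperty("format", "http://purl.org/dc/terms/format",
                           "http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"),
            ],
        )
    ],
)
```

The requirement status and cardinality shown with question marks on the slide are left out here, since the slides do not say how they are modelled.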
  48. 48. Property Restriction Definition (XSD, FAIR Profile, SKOS) Describes the constraints on the possible values for a predicate in the target Repository’s metadata Schema
  49. 49. Property Restriction Definition (XSD, FAIR Profile, SKOS) Describes the constraints on the possible values for a predicate in the target Repository’s metadata Schema NOTE: we cannot use rdfs:range because we are meta-modelling! The predicate is a CLASS at the meta-model level, so use of rdfs:range is not appropriate.
  50. 50. Property Restriction Definition (XSD, FAIR Profile, SKOS) Describes the constraints on the possible values for a predicate in the target Repository’s metadata Schema The possible values are: ● An XSD Datatype ● Another DCAT Profile (i.e. hierarchical profiles) ● A SKOS View on a set of ontology terms from one or more ontologies
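As a rough illustration of how a client might apply two of these three restriction kinds, the sketch below checks a value against either an XSD datatype or a SKOS concept scheme. The `value_allowed` function and the `schemes` lookup table are hypothetical. The datatype checks use Python's own parsers, which only approximate the XSD lexical rules (for instance, `xsd:date` values with a timezone suffix are rejected here). Nested profiles are deliberately not handled in this sketch.

```python
from datetime import date

XSD = "http://www.w3.org/2001/XMLSchema#"

# Approximate lexical checks for two XSD datatypes, using Python parsers.
XSD_CHECKS = {
    XSD + "integer": int,
    XSD + "date": date.fromisoformat,
}


def value_allowed(value, restriction, schemes):
    """Check `value` against a restriction IRI.

    `schemes` maps a SKOS concept scheme IRI to the set of concept IRIs in it
    (an assumption: in practice this would be read from the scheme's RDF).
    """
    if restriction in XSD_CHECKS:
        try:
            XSD_CHECKS[restriction](value)
        except ValueError:
            return False
        return True
    if restriction in schemes:
        return value in schemes[restriction]
    # A nested profile, or a datatype this sketch does not know about.
    raise ValueError(f"unsupported restriction: {restriction}")
```

A caller would pass the restriction IRI found in the profile, for example `value_allowed("2011-12-05", XSD + "date", {})`.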
  51. 51. A FAIR Profile (an RDF document that follows the FAIR Profile Schema) This! Metadata Record (e.g. DCAT-compliant RDF) DCAT RDFS Schema Fair Profile Fair Profile Schema
  52. 52. A FAIR Profile FAIR Profile FP Class FP Property Property Restriction Definition hasClass hasProperty allowed Values classType propertyType External Class External Predicate
  53. 53. A FAIR Profile FAIR Profile FP Class FP Property Property Restriction Definition hasClass hasProperty allowed Values classType propertyType External Class External Predicate FAIR Profiles are FAIR! (Identifiable, Re-usable, and Shareable)
  54. 54. A FAIR Profile FAIR Profile FP Class FP Property Property Restriction Definition hasClass hasProperty allowed Values classType propertyType External Class External Predicate
  55. 55. A FAIR Profile FAIR Profile FP Class FP Property Property Restriction Definition hasClass hasProperty allowed Values classType propertyType External Class External Predicate
  56. 56. A FAIR Profile The CoreMicroarrayDistributionMetadata Descriptor Class FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType propertyType External Class External Predicate Property Restriction Definition
  57. 57. CoreMicroarrayDistributionMetadata Class Descriptor FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType propertyType External Class External Predicate Property Restriction Definition
  58. 58. CoreMicroarrayDistributionMetadata Descriptor The Class follows the “DCAT Distribution” Class model FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType propertyType External Class External Predicate Property Restriction Definition
  59. 59. CoreMicroarrayDistributionMetadata Descriptor It uses only 3 properties from the “DCAT Distribution” Class model FAIR Profile FP Class FP Property hasClass hasProperty allowed Values propertyType External Class External Predicate Property Restriction Definition classType
  60. 60. CoreMicroarrayDistributionMetadata Descriptor: Property #1 It uses only 3 properties from the “DCAT Distribution” Class model ...let’s look at one of them in detail FAIR Profile FP Class FP Property hasClass hasProperty allowed Values propertyType External Class External Predicate classType Property Restriction Definition
  61. 61. CoreMicroarrayDistributionMetadata Descriptor: Property #1 This Meta-Descriptor element is a ‘FAIR Profile Property’ Class FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType propertyType External Class External Predicate Property Restriction Definition
  62. 62. CoreMicroarrayDistributionMetadata Descriptor: Property #1 This is its label within that organization’s metadata descriptor FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType propertyType External Class External Predicate Property Restriction Definition
  63. 63. CoreMicroarrayDistributionMetadata Descriptor: Property #1 This is the URL of the Predicate used by that descriptor FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType propertyType External Class External Predicate Property Restriction Definition
  64. 64. CoreMicroarrayDistributionMetadata Descriptor: Property #1 This is the “range” of that Predicate within the organization’s descriptor FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType External Class External Predicate Property Restriction Definition propertyType
  65. 65. CoreMicroarrayDistributionMetadata Descriptor: Property #2 Let’s look at a different property from the CoreMicroarrayDistributionMetadata Class FAIR Profile FP Class FP Property hasClass hasProperty allowed Values propertyType External Class External Predicate classType Property Restriction Definition
  66. 66. CoreMicroarrayDistributionMetadata Descriptor: Property #2 FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType propertyType External Class External Predicate Property Restriction Definition
  67. 67. CoreMicroarrayDistributionMetadata Descriptor: Property #2 This is the label for that property FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType propertyType External Class External Predicate Property Restriction Definition
  68. 68. CoreMicroarrayDistributionMetadata Descriptor: Property #2 The URL of the predicate of this Property FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType propertyType External Class External Predicate Property Restriction Definition
  69. 69. CoreMicroarrayDistributionMetadata Descriptor: Property #2 In the Metadata Descriptor, this property is constrained by the set of ontology terms defined in the SKOS Concept Scheme EDAM_Microarray_Data_Format FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType External Class External Predicate Property Restriction Definition propertyType
  70. 70. <rdf:Description xmlns:ns1="http://www.w3.org/2002/07/owl#" rdf:about="http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"> <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Ontology"/> <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#ConceptScheme"/> <ns1:imports rdf:resource="http://purl.bioontology.org/ontology/EDAM"/> </rdf:Description> <rdf:Description xmlns:ns1="http://www.w3.org/2000/01/rdf-schema#" xmlns:ns2="http://www.w3.org/2004/02/skos/core#" rdf:about="http://edamontology.org/format_1641"> <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/> <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/> <ns1:label>affymetrix-exp</ns1:label> <ns2:broader rdf:resource="http://edamontology.org/format_2056"/> <ns2:inScheme rdf:resource="http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"/> </rdf:Description> <rdf:Description xmlns:ns1="http://www.w3.org/2000/01/rdf-schema#" xmlns:ns2="http://www.w3.org/2004/02/skos/core#" rdf:about="http://edamontology.org/format_2056"> <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#NamedIndividual"/> <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/> <ns1:label>Microarray experiment data format</ns1:label> <ns2:broader rdf:resource="http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"/> <ns2:inScheme rdf:resource="http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"/> </rdf:Description> http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format This is a “SKOSified” view of the EDAM Ontology Jupp, et al., “Taking a view on bio-ontologies” ceur-ws.org/Vol-897/session4-paper22.pdf
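To show how a client could read such a "SKOSified" view, here is a minimal sketch that lists the concepts belonging to the scheme, using only Python's standard XML parser. The `DOC` string is a cut-down, re-prefixed version of the RDF/XML on this slide, wrapped in an `rdf:RDF` root so that it parses on its own. The `concepts_in_scheme` function is a hypothetical helper and only understands this flat `rdf:Description` layout; a proper RDF parser would be the safer choice for real data.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
SCHEME = "http://biordf.org/DataFairPort/ConceptSchemes/EDAM_Microarray_Data_Format"

# Reduced copy of the slide's RDF/XML, with a root element added.
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:skos="{SKOS}">
  <rdf:Description rdf:about="http://edamontology.org/format_1641">
    <rdf:type rdf:resource="{SKOS}Concept"/>
    <rdfs:label>affymetrix-exp</rdfs:label>
    <skos:broader rdf:resource="http://edamontology.org/format_2056"/>
    <skos:inScheme rdf:resource="{SCHEME}"/>
  </rdf:Description>
  <rdf:Description rdf:about="http://edamontology.org/format_2056">
    <rdf:type rdf:resource="{SKOS}Concept"/>
    <rdfs:label>Microarray experiment data format</rdfs:label>
    <skos:inScheme rdf:resource="{SCHEME}"/>
  </rdf:Description>
</rdf:RDF>"""


def concepts_in_scheme(xml_text, scheme):
    """Map concept IRI -> label for every description that is skos:inScheme `scheme`."""
    root = ET.fromstring(xml_text)
    found = {}
    for node in root.findall(f"{{{RDF}}}Description"):
        in_schemes = [
            el.get(f"{{{RDF}}}resource")
            for el in node.findall(f"{{{SKOS}}}inScheme")
        ]
        if scheme in in_schemes:
            found[node.get(f"{{{RDF}}}about")] = node.findtext(f"{{{RDFS}}}label")
    return found
```

The resulting dictionary is the set of allowed values (with display labels) that a generated form could offer for the constrained property.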
  71. 71. A DCAT Profile Return to the very top of our FAIR Profile Follow the ExtendedAuthorship Class FAIR Profile FP Class FP Property Property Restriction Definition hasClass hasProperty allowed Values classType propertyType External Class External Predicate
  72. 72. ExtendedAuthorship Follow one of the properties of the ExtendedAuthorship Class FAIR Profile FP Class FP Property hasClass hasProperty allowed Values propertyType External Class External Predicate classType Property Restriction Definition
  73. 73. Author ORCID FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType propertyType External Class External Predicate Property Restriction Definition
  74. 74. Author ORCID The allowed values of this Property are constrained to be individuals that follow the FAIR Profile Schema “DemoORCIDProfileScheme” FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType External Class External Predicate Property Restriction Definition propertyType
  75. 75. http://biordf.org/DataFairPort/ProfileSchemas/DemoORCIDProfileScheme.rdf FAIR Profile FP Class FP Property Property Restriction Definition hasClass hasProperty allowed Values classType propertyType External Class External Predicate
  76. 76. http://biordf.org/DataFairPort/ProfileSchemas/DemoORCIDProfileScheme.rdf FAIR Profile FP Class FP Property hasClass hasProperty allowed Values classType External Class External Predicate propertyType This is parsed in exactly the same way as our original DemoMicroarrayProfileScheme, but is embedded within it as the value of the author_ORCID property. …Arbitrary, hierarchical layers of complexity… FAIR Profile FP Class hasClass hasProperty classType External Class
  77. 77. So to build an interface (e.g. query or data-capture) from a FAIR Profile: [1] Parse all FAIR Profile classes Parse the properties of each class Determine the target predicate Determine the target value-restrictions Call [1] if restriction is a FAIR Profile Create a metadata [capture/query] facet with that predicate and that restriction
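The recursive walk described on this slide can be sketched as follows. This is one possible reading of the steps, not the Skunkworks implementation: profiles are held as plain dictionaries in a lookup table, and a property whose restriction is itself a profile is expanded into the facets of that nested profile. The ORCID profile IRI comes from the slides; the other IRIs, the predicate names, and the contents of the ORCID profile are made up for the example. There is no protection against profiles that reference each other in a cycle.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
ORCID_PROFILE = "http://biordf.org/DataFairPort/ProfileSchemas/DemoORCIDProfileScheme.rdf"
MICROARRAY_PROFILE = "http://example.org/profiles/DemoMicroarrayProfileScheme"  # hypothetical IRI

# Hypothetical in-memory profiles, keyed by IRI.
PROFILES = {
    MICROARRAY_PROFILE: {"classes": [
        {"label": "ExtendedAuthorship", "properties": [
            {"label": "author_ORCID",
             "predicate": "http://example.org/vocab/authorORCID",  # made-up predicate
             "allowed_values": ORCID_PROFILE},
        ]},
    ]},
    ORCID_PROFILE: {"classes": [
        {"label": "ORCIDRecord", "properties": [       # made-up contents
            {"label": "orcid_id",
             "predicate": "http://example.org/vocab/orcidId",
             "allowed_values": XSD_STRING},
        ]},
    ]},
}


def build_facets(profile_iri, profiles, path=()):
    """Walk a profile and return one facet per leaf property."""
    facets = []
    for cls in profiles[profile_iri]["classes"]:           # parse all classes
        for prop in cls["properties"]:                     # parse each class's properties
            here = path + (prop["label"],)
            restriction = prop["allowed_values"]           # target value restriction
            if restriction in profiles:                    # restriction is a FAIR Profile: recurse
                facets.extend(build_facets(restriction, profiles, here))
            else:                                          # otherwise emit a capture/query facet
                facets.append({
                    "path": "/".join(here),
                    "predicate": prop["predicate"],
                    "restriction": restriction,
                })
    return facets
```

Each returned facet carries enough information (predicate plus restriction) to render one field of a query or data-capture form.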
  78. 78. DCAT Profile Class #1 DCAT Profile Class #2 DCAT Profile Class #3 DCAT Profile Class #4 (embedded) Value constraints Descriptor-specific labels associated with ontology predicates (if applicable) “Classes” may be associated with an ontology to allow reasoning, or may just represent an “arbitrary” grouping of properties within the Target metadata descriptor Metadata Descriptor-specific details are captured e.g. this field is required by this target Metadata Descriptor
  79. 79. Other features of FAIR profiles ● Do not require repository participation ● Provides a purpose-driven, potentially non-comprehensive “view” on a repository, of which there may be many, according to what the profile author needs to cross-query ● Profiles of any given repository facet are not required to be identical! e.g. A different profile might utilize a different controlled vocabulary over any given facet (e.g. a freetext facet) ● Anybody can define a profile (of course, the profile defined by the repository owner should be considered “canonical”... the rest are just purpose-built “best-guesses”) ● FAIR profiles can/should be indexed and shared, to facilitate cross-repository interoperability and integration ● There is no (obvious) reason why a FAIR profile could not be used to describe the DATA in the repository, not just the metadata...
  80. 80. Nothin’ ain’t worth nothin’, but it’s free! -- Kris Kristofferson “All problems in computer science can be solved by another level of indirection ...But that usually will create another problem." -- David Wheeler
  81. 81. Nothin’ ain’t worth nothin’, but it’s free! The FAIR profile isn’t “a magic bean”! It DOES NOT ACCOMPLISH SEMANTIC MAPPING between one field in one repository, and a semantically-related field in another repository
  82. 82. Nothin’ ain’t worth nothin’, but it’s free! The FAIR profile isn’t “a magic bean”! It does give us a standard way to identify, describe, and meta-link these fields, and a predictable place where a mapping mechanism could be injected.
  83. 83. Nothin’ ain’t worth nothin’, but it’s free! The FAIR profile isn’t “a magic bean”! ...we don’t inject it (yet!) because that would require invention of yet another “standard”, and we want to avoid that if possible!
  84. 84. Nothin’ ain’t worth nothin’, but it’s free! The FAIR profile isn’t “a magic bean”! There may be some in the audience who, like me, recognize that this problem is nearly identical to the problem faced by the WSDL -> SAWSDL community. I will be looking at their solution for guidance in the next phase of FAIR Profiles... … so we still have problems, but at least they are now re-defined as problems for which there are solutions!
  85. 85. Skunkworks Participants ● Mark Wilkinson ● Michel Dumontier ● Barend Mons ● Tim Clark ● Jun Zhao ● Paolo Ciccarese ● Paul Groth ● Erik van Mulligen ● Luiz Olavo Bonino da Silva Santos ● Matthew Gamble ● Carole Goble ● Joël Kuiper ● Morris Swertz ● Erik Schultes ● Mercè Crosas ● Adrian Garcia ● Philip Durbin ● Jeffrey Grethe ● Katy Wolstencroft ● Sudeshna Das ● M. Emily Merrill
  86. 86. Post-presentation comments We should look at ISO 11179 -> are we duplicating those efforts or are we creating something that is an implementation of those efforts? See also Dublin Core’s similar initiative.
