Bioschemas Community: Developing profiles over Schema.org to make life sciences resources more findable

The Bioschemas community (http://bioschemas.org) is a loose collaboration formed by a wide range of life science resource providers and informaticians. The community is developing profiles over Schema.org to make life science resources, such as data about a specific protein, sample, or training event, more discoverable on the web. While the content of well-known resources such as UniProt (for protein data) is easily discoverable, there is a long tail of specialist resources that would benefit from embedding Schema.org markup in a standardised way.
The community has developed twelve profiles for specific types of life science resources (http://bioschemas.org/specifications/), with another six at an early draft stage. For each profile, a set of use cases has been identified. These typically focus on search, but several facilitate lightweight data exchange to support data aggregators such as Identifiers.org, FAIRsharing.org, and BioSamples. The next stage in developing a profile is to map the terms used in the use cases to existing properties in Schema.org and in domain ontologies. The properties are then prioritised to support the use cases: a minimal set of about six properties is identified, along with a larger set of recommended and optional properties. For each property, an expected cardinality is defined and, where appropriate, object values are specified from controlled vocabularies. Before a profile is finalised, it must be demonstrated that resources can deploy the markup.
In this talk, we will outline the progress made by the Bioschemas Community in a single year through three hackathon events. We will discuss the processes the community follows to foster collaboration, and highlight the benefits and drawbacks of using open Google documents and spreadsheets to support the community in developing the profiles. We will conclude by summarising future opportunities and directions for the community.
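
The profile structure described in the abstract (minimum, recommended, and optional properties, each with an expected cardinality) can be sketched in Python as below. This is only an illustration under stated assumptions: `PropertyRule`, `EXAMPLE_PROFILE`, and `check_markup` are hypothetical names, and the properties listed are not taken from any actual Bioschemas specification.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyRule:
    """One property constraint in a profile (hypothetical structure)."""
    name: str   # Schema.org property name
    level: str  # "minimum", "recommended", or "optional"
    many: bool  # True if more than one value is allowed


# Hypothetical profile for illustration only; not a real Bioschemas profile.
EXAMPLE_PROFILE = [
    PropertyRule("name", "minimum", many=False),
    PropertyRule("identifier", "minimum", many=False),
    PropertyRule("url", "minimum", many=False),
    PropertyRule("description", "recommended", many=False),
    PropertyRule("keywords", "optional", many=True),
]


def check_markup(markup: dict, profile: list[PropertyRule]) -> dict[str, list[str]]:
    """Report missing minimum/recommended properties and cardinality violations."""
    report = {"missing_minimum": [], "missing_recommended": [], "cardinality": []}
    for rule in profile:
        value = markup.get(rule.name)
        if value is None:
            # Optional properties may be absent without being reported.
            if rule.level == "minimum":
                report["missing_minimum"].append(rule.name)
            elif rule.level == "recommended":
                report["missing_recommended"].append(rule.name)
            continue
        # A list with several values violates a single-valued property.
        if isinstance(value, list) and len(value) > 1 and not rule.many:
            report["cardinality"].append(rule.name)
    return report


if __name__ == "__main__":
    example = {
        "@type": "Dataset",
        "name": "Example resource",
        "url": ["https://example.org/a", "https://example.org/b"],
    }
    print(check_markup(example, EXAMPLE_PROFILE))
```

A check of this kind fits the final step mentioned above, where resources demonstrate that they can deploy the markup before a profile is finalised.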

  1. Alasdair J G Gray, Heriot-Watt University, Bioschemas Community (http://bioschemas.org/people/). Bioschemas Community: Developing profiles over Schema.org to make life sciences resources more findable
  2. Schema.org: markup for web pages. Syntaxes: RDFa, JSON-LD, Microdata. With markup:
     <div itemscope itemtype="http://schema.org/Recipe">
       <h1 itemprop="name">Classic potato salad</h1>
       <div itemprop="nutrition" itemscope itemtype="http://schema.org/NutritionInformation">
         Nutrition facts: <span itemprop="calories">144 kcal</span>,
       </div>
       Ingredients:
       - <span itemprop="recipeIngredient">800g small new potato</span>
       - <span itemprop="recipeIngredient">3 shallot</span>
     </div>
     30 April 2018 #bioschemas
  3. [no text on slide]
  4. Bioschemas
     • Schema.org for life sciences
       - Introduce life sciences types
     • Use case driven
       - Finding data
       - Presenting search results
       - Metadata exchange
     • Minimum properties: 6
     • Link to domain ontologies
     Specification on top of schema.org: layer of constraints + documentation + extensions
     Specification, Data model, Minimum information, Controlled vocabularies, Cardinality, Documentation, Examples, New (properties | types)
  5. [no text on slide]
  6. Bioschemas community
  7. Bioschemas Community: 15 Specifications, 18 Working groups, 12 Code repositories, 18 Events, 229 GitHub issues, 468 Community emails, 49 Data resources, 16 Live deploys, 223 People, 35 PMs (funding)
  8. Bioschemas Process: Mapping, Specification, Use cases, Mockup, Adoption, Testing, Application
  9. Bioschemas Profiles: http://bioschemas.org/specifications/
  10. New Types for Schema.org
  11. Adoption: http://bioschemas.org/liveDeploys/
  12. Acknowledgements: http://bioschemas.org/people/
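
Slide 2 names JSON-LD alongside RDFa and Microdata as a syntax for Schema.org markup, but shows only the Microdata form. As a rough sketch, the same recipe can be written as JSON-LD and wrapped in a script tag for embedding in a page. The helper name `to_script_tag` is hypothetical, and the `https://schema.org` context is an assumption (the slide uses `http://schema.org/` type URLs).

```python
import json

# The recipe from slide 2, expressed with the same Schema.org types and properties.
recipe = {
    "@context": "https://schema.org",
    "@type": "Recipe",
    "name": "Classic potato salad",
    "nutrition": {
        "@type": "NutritionInformation",
        "calories": "144 kcal",
    },
    "recipeIngredient": ["800g small new potato", "3 shallot"],
}


def to_script_tag(data: dict) -> str:
    """Serialise a JSON-LD object into a script element for a web page."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    print(to_script_tag(recipe))
```

Unlike Microdata, the JSON-LD block sits apart from the visible HTML, so a page template does not need to be restructured to carry it.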

Editor's Notes

  • Machine interpretable
    This example is with Microdata
  • Currently no rich search results
    Would like something more like this
  • Define use case
    Metadata crosswalk and mapping to schema.org
    Metadata providers
    Metadata registries
    Standards defining metadata
    Bioschemas specification
    Define minimum properties based on “finding” use cases
    Define cardinality and suggested controlled vocabularies
    Test with existing entries
    Adoption by data repositories and registries
    Applications
  • Beacon
    Data Catalog
    Dataset
    Event
    Laboratory Protocol
    Organization
    Person
    Phenotype
    Protein
    Protein Annotation
    Protein Structure
    Sample
    Standard
    Tool
    Training Material
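
The "metadata crosswalk and mapping to schema.org" step in the notes above can be pictured as a simple field-renaming table. The sketch below is an assumption-laden illustration: the native field names (`title`, `accession`, and so on), the `CROSSWALK` table, and `to_schema_org` are all hypothetical and do not come from any Bioschemas document.

```python
from __future__ import annotations

# Hypothetical crosswalk from a provider's native field names to Schema.org properties.
CROSSWALK = {
    "title": "name",
    "accession": "identifier",
    "summary": "description",
    "landing_page": "url",
}


def to_schema_org(record: dict, schema_type: str, crosswalk: dict[str, str]) -> tuple[dict, list[str]]:
    """Rename mapped fields to Schema.org properties and list the unmapped ones."""
    markup = {"@context": "https://schema.org", "@type": schema_type}
    unmapped = []
    for field, value in record.items():
        prop = crosswalk.get(field)
        if prop is None:
            # Fields without a mapping are candidates for a domain ontology term
            # or a proposed new property.
            unmapped.append(field)
        else:
            markup[prop] = value
    return markup, unmapped


if __name__ == "__main__":
    native = {"title": "Example entry", "accession": "ABC123", "curator_notes": "checked"}
    print(to_schema_org(native, "Dataset", CROSSWALK))
```

Keeping the unmapped fields visible is a deliberate choice here: they show where existing Schema.org properties fall short for a given resource.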