Make your Web resources more discoverable with Bioschemas markup – Bioschemas Tutorial June 2019
Improving discoverability for Life Sciences resources
Alasdair J.G. Gray
Bioschemas Leadership Team Chair
Heriot-Watt University/ELIXIR-UK
Bioschemas
ELIXIR All Hands Tutorial
Lisbon, Portugal – 19 June 2019
Google Search
http://bioschemas.org
Google Dataset Search (Sept 2018)
https://toolbox.google.com/datasetsearch
https://www.blog.google/products/search/making-it-easier-discover-datasets/
Picture: Carole Goble, Turing Lecture 2018
Schema.org: Semantic Markup for the Web
Structured data → descriptors
● Types (614): what we are talking about
● Properties (905): what we can say about those things
Bioschemas
• Community initiative built on top of schema.org
• Aim
• Improve data discoverability and interoperability in Life Sciences
• Approach
• Add Life Science types to schema.org
• Provide usage guidelines and examples
• 6 Minimal properties
• Link to domain ontologies
• Support software
Profile over schema.org
Layer of constraints + documentation + extensions
Specification
Data model
Minimum information
Controlled vocabularies
Cardinality
Documentation
Examples
New (properties | types)
Findable, Accessible, Interoperable, Reusable
★ Globally unique identifiers
★ Community-defined enriched metadata
★ Indexable by search engines
★ JSON-LD/RDFa
★ Link to controlled vocabularies
★ Links to other resources
★ License
★ Provenance
★ Retrievable
★ HTTP
Schema.org for Datasets
Schema definition:
●Dataset: A body of structured information describing some topic(s) of interest
http://schema.org/Dataset
●91 properties including:
○name
○description
○isFamilyFriendly
Google Dataset Profile
• 2 required properties
• Used for Google Dataset Search
• 10 recommended properties
• Link to DataCatalog
• Link to DataDownload
Other profiles: Events, Jobs, ...
https://developers.google.com/search/docs/data-types/dataset
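As a rough illustration of this profile, the sketch below describes a dataset as a Python dictionary and checks it for the two properties that Google's dataset guidelines list as required, `name` and `description`. The `missing_required` helper, the example values and the example.org URL are illustrative assumptions, not part of the tutorial.

```python
import json

# Required properties according to Google's Dataset guidelines.
GOOGLE_REQUIRED = ("name", "description")


def missing_required(markup):
    """Return the required properties that are absent or empty."""
    return [prop for prop in GOOGLE_REQUIRED if not markup.get(prop)]


# Hypothetical dataset description; every value is made up for illustration.
dataset = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example expression dataset",
    "description": "Gene expression measurements for an example study.",
    # Links to the catalogue holding the dataset and to a download.
    "includedInDataCatalog": {"@type": "DataCatalog", "name": "Example catalog"},
    "distribution": {
        "@type": "DataDownload",
        "contentUrl": "https://example.org/data.csv",
    },
}

if __name__ == "__main__":
    print("Missing required properties:", missing_required(dataset))
    print(json.dumps(dataset, indent=2))
```

Serialising the dictionary with `json.dumps` gives the JSON-LD text that would be embedded in the page.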
Bioschemas Dataset Profile
• Compliant with Google Dataset Profile
• 5 minimal properties
• 8 recommended properties
• Link to DataCatalog
• Link to DataDownload
http://bioschemas.org/specifications/Dataset/
Extending Schema.org for the Life Sciences
7 release candidates
Submission in progress!
More types in development
Current Bioschemas Profiles
Profile | Version | Group | Live Deploys | Status notes
DataCatalog | 0.2 (Jun 2019) | Data Repos | 20 | 0.2 fixes minor issues
Dataset | 0.3 (Jun 2019) | Datasets | 23 | 0.3 fixes minor issues
Event | 0.1 (July 2018) | Events | 7 | Used by TeSS: undergoing revision due to addition of CourseInstance
Sample | 0.2 (Nov 2018) | Samples | 1 |
Taxon | 0.3 (Nov 2018) | Biodiversity | 0 |
Tool | 0.1 (Mar 2018) | Tools | 5 | 0.3-DRAFT based on bio.tools profile, needs review
TrainingMaterial | 0.2 (July 2018) | Training | 0 | Used by TeSS: 0.5-DRAFT incorporating changes from Course
Draft Bioschemas Profiles
● Beacon: 0.2-DRAFT 2018-04-23
● BioSample: 0.1-DRAFT
● ChemicalSubstance: 0.2-DRAFT 2019-06-11
● Course: 0.6-DRAFT 2019-06-06
● CourseInstance: 0.6-DRAFT 2019-06-06
● DNA: 0.1-DRAFT 2018-11-13
● DataRecord: 0.2-DRAFT 2019-06-14
● Gene: 0.5-DRAFT 2019-06-14
● Journal: 0.1-DRAFT 2019-02-08
● LabProtocol: 0.3-DRAFT 2019-06-14
● MolecularEntity: 0.2-DRAFT 2019-11-15
● Organization: 0.1-DRAFT 2018-03-13
● Person: 0.1-DRAFT 2018-03-14
● Phenotype: 0.1-DRAFT 2018-11-15
● Protein: 0.8-DRAFT 2019-05-08
● ProteinAnnotation: 0.4-DRAFT 2018-02-25
● ProteinStructure: 0.5-DRAFT 2018-08-15
● PublicationIssue: 0.1-DRAFT 2019-02-08
● PublicationVolume: 0.1-DRAFT 2019-02-08
● ScholarlyArticle: 0.1-DRAFT 2019-02-08
● SemanticAnnotation: 0.1-DRAFT 2019-02-08
● Standard: 0.1-DRAFT 2018-01-01
● Study: 0.1-DRAFT 2018-11-15
● Tool: 0.3-DRAFT 2018-11-21
● TrainingMaterial: 0.6-DRAFT 2019-06-06
● Workflow: 0.1-DRAFT 2019-02-08
Mapping Profile Use cases
Mockup
Adoption
Testing Application
Profile Creation Process
Bioschemas Software
Bioschemas Generator
● Supports all profiles
○ Current and draft
● Validates input
● Form generated from YAML description
● Examples extracted from profile
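The input validation mentioned above might look roughly like the following sketch. It is not the generator's actual code: the profile is written as a Python dictionary standing in for the YAML description, and the marginality and cardinality values are assumptions chosen for illustration.

```python
# Hypothetical profile description; the real generator reads this from YAML.
# The marginality and cardinality values below are illustrative assumptions.
EVENT_PROFILE = {
    "type": "Event",
    "properties": {
        "name": {"marginality": "minimum", "cardinality": "ONE"},
        "description": {"marginality": "minimum", "cardinality": "ONE"},
        "startDate": {"marginality": "minimum", "cardinality": "ONE"},
        "contact": {"marginality": "recommended", "cardinality": "MANY"},
    },
}


def validate(markup, profile):
    """Return a list of human-readable problems found in the markup."""
    problems = []
    if markup.get("@type") != profile["type"]:
        problems.append(f"@type should be {profile['type']}")
    for prop, rule in profile["properties"].items():
        value = markup.get(prop)
        if value is None:
            # Only minimum properties are reported when missing.
            if rule["marginality"] == "minimum":
                problems.append(f"missing minimum property: {prop}")
            continue
        # A list of values violates a cardinality of ONE.
        if rule["cardinality"] == "ONE" and isinstance(value, list):
            problems.append(f"{prop} allows only one value")
    return problems
```

Calling `validate(markup, EVENT_PROFILE)` on a form submission returns an empty list when every minimum property is present and no single-valued property holds a list.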
Exploiting Bioschemas Markup
TeSS: Specialised Search
Bioschemas Event:
• contact
• description
• endDate
• eventType
• hostInstitution
• location
• name
• startDate
• …
Automated Data Curation
Bioschemas DataCatalog:
• description
• keywords
• name
• provider
• url
• alternateName
• citation
• dateCreated
• license
• …
Data Exchange: Without an API
MarRef → BioSamples
https://github.com/EBIBioSamples/bioschemas_marref_demo/blob/master/Summary.md
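Exchanging data without an API relies on the consumer reading the markup straight out of the published pages. A minimal sketch of that step, using only the Python standard library, is shown below; the `JsonLdExtractor` class and `extract_jsonld` function are hypothetical names and are not taken from the linked demo.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script contents arrive here as raw text.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html):
    """Return every JSON-LD block found in an HTML document."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Ordinary `<script>` elements are skipped, so only the structured markup reaches the consumer.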
BKG Explorer
Built over Bioschemas markup crawled from 30 live deployments
20,000 pages
Bioschemas
What?
• Exploiting schema.org to make Life Sciences resources more discoverable
• Search engines will index and understand markup
How?
• Extending schema.org vocabulary for life sciences
• 7 release candidate types
• Provide guidelines on how to mark up resources
Bioschemas Community
200+ People
7 Tutorials (2018)
17 Types
6 Publications (2018)
30 Profiles
62 Sites
11M+ Pages
http://bioschemas.org/liveDeploys
Acknowledgements http://bioschemas.org/people
http://bioschemas.org/ @bioschemas https://github.com/bioschemas/
Join Bioschemas: http://bioschemas.org/howtojoin/
Creating and Deploying
Bioschemas Markup
Material from: Justin Clark-Casey
License: Attribution 4.0 International (CC BY 4.0)
Kenneth McLeod
Creating Bioschemas markup
● Markup is in a format called JSON-LD
● Embedded directly into webpages
● Let’s look at an example of the DataCatalog schema as used by Bioschemas
○ This comes from schema.org but Bioschemas adds
■ Mandatory/recommended/optional properties
■ Cardinality constraints
Markup can be placed in either the head or the body.
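To make the embedding step concrete, here is a small sketch that serialises a markup dictionary as a JSON-LD script element and places it in the page head. The function names and the example catalogue are assumptions for illustration; a real site would more likely do this in its page templates.

```python
import json


def jsonld_script_tag(markup):
    """Serialise a markup dict as a JSON-LD <script> element."""
    # Escape "</" so that a string value can never close the script element early.
    body = json.dumps(markup, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


def embed_in_head(html, markup):
    """Insert the JSON-LD script element just before the closing </head> tag."""
    return html.replace("</head>", jsonld_script_tag(markup) + "\n</head>", 1)


# Hypothetical catalogue description; the values are made up.
catalog = {
    "@context": "https://schema.org",
    "@type": "DataCatalog",
    "name": "Example catalog",
    "url": "https://example.org/",
}

if __name__ == "__main__":
    page = "<html><head><title>Example</title></head><body></body></html>"
    print(embed_in_head(page, catalog))
```

Because the markup is written into the HTML itself, it is visible to crawlers that do not execute scripts.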
Let’s look at this in Google’s Structured Data Testing Tool
@context is overwritten by Google
Technically any prefixes can be defined here, e.g.,
"@context":["https://schema.org", {"OBI":"http://purl.obolibrary.org/obo/OBI_" ...}],
"@type":["Sample","OBI:0000747"] …
BUT, Google will overwrite this with the basic "@context": "http://schema.org"
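The prefix mechanism can be sketched as follows. This is a deliberate simplification: the context URL is treated as the vocabulary base, whereas a real JSON-LD processor fetches and applies the context document. The `expand_type` helper is a hypothetical name, and only the OBI prefix from the example above is used.

```python
def expand_type(compact, context):
    """Expand a compact type such as "OBI:0000747" to a full IRI."""
    default_vocab = None
    prefixes = {}
    entries = context if isinstance(context, list) else [context]
    for entry in entries:
        if isinstance(entry, str):
            # Simplification: use the context URL itself as the vocabulary base.
            default_vocab = entry.rstrip("/") + "/"
        else:
            prefixes.update(entry)
    prefix, sep, local = compact.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    if default_vocab is None:
        return compact  # nothing to expand against
    return default_vocab + compact


context = ["https://schema.org", {"OBI": "http://purl.obolibrary.org/obo/OBI_"}]

if __name__ == "__main__":
    for type_name in ["Sample", "OBI:0000747"]:
        print(type_name, "->", expand_type(type_name, context))
```

If the prefix definitions are dropped, as happens when the context is overwritten, `OBI:0000747` can no longer be resolved to the ontology term.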
@id - gives a node a URL
Without @id, nodes get auto-generated blank node identifiers, e.g.,
<script type="application/ld+json">{
"@context" : "https://schema.org",
"@type" : "DataCatalog", ...
becomes:
_:genid2d4335ed7c72694275bea5b6a86ad9f82b2db0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://schema.org/DataCatalog> .
Bad for Linked Data as no one can reference this.
@id - gives a node a URL
With an @id you choose the URL for nodes, e.g.,
<script type="application/ld+json">{
"@context" : "https://schema.org",
"@type" : "DataCatalog",
"@id" : "https://www.ebi.ac.uk/biosamples" …
becomes:
<https://www.ebi.ac.uk/biosamples> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://schema.org/DataCatalog> .
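The difference between the two cases can be reproduced with a short sketch that writes the type statement for a node as an N-Triples line. It is an approximation, not a JSON-LD processor: the context URL is used directly as the vocabulary base, and the blank node label is simply a random identifier.

```python
import uuid

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def type_triple(node):
    """Return the rdf:type statement for a JSON-LD node as an N-Triples line."""
    # Simplification: the context URL is used directly as the vocabulary base.
    vocab = node["@context"].rstrip("/") + "/"
    if "@id" in node:
        subject = f"<{node['@id']}>"
    else:
        # No @id: the node becomes a blank node with a locally generated label
        # that nobody outside this document can refer to.
        subject = f"_:genid{uuid.uuid4().hex}"
    return f"{subject} <{RDF_TYPE}> <{vocab}{node['@type']}> ."


if __name__ == "__main__":
    anonymous = {"@context": "https://schema.org", "@type": "DataCatalog"}
    named = dict(anonymous, **{"@id": "https://www.ebi.ac.uk/biosamples"})
    print(type_triple(anonymous))
    print(type_triple(named))
```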
Warning! Don’t use the same @id for everything
DataCatalog & Dataset defined separately, but combined into a single entity:
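A consumer that merges statements by identifier behaves roughly like the sketch below, which is why reusing one @id collapses separate descriptions into a single entity. The `merge_by_id` function and the example.org identifier are illustrative assumptions.

```python
def merge_by_id(nodes):
    """Merge JSON-LD nodes that share an @id into one entity per identifier."""
    merged = {}
    for node in nodes:
        entity = merged.setdefault(node["@id"], {"@id": node["@id"]})
        for key, value in node.items():
            if key != "@id":
                # Values from every node with this @id accumulate together.
                entity.setdefault(key, []).append(value)
    return merged


# Two separate descriptions that mistakenly share the same @id.
nodes = [
    {"@id": "https://example.org/data", "@type": "DataCatalog", "name": "Example catalog"},
    {"@id": "https://example.org/data", "@type": "Dataset", "name": "Example dataset"},
]

if __name__ == "__main__":
    for identifier, entity in merge_by_id(nodes).items():
        print(identifier, entity["@type"], entity["name"])
```

Giving each description its own @id keeps the catalogue and the dataset as two distinct entities.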
Google Structured Data Testing Tool (GSDTT): common errors
If you don’t meet Google’s desired property specification for a given type you see errors like:
If the Bioschemas spec says this is OK, you can ignore the error (FYI it is a real error)
Not minimum properties in Bioschemas; do what you want
This error is caused by the incorrect target type of location.
Description is a minimum property for Bioschemas (i.e. mandatory)
Bioschemas types not yet accepted by schema.org: ignore these errors
Markup Generator
Example:
https://bio.tools/blast
https://blast.ncbi.nlm.nih.gov/Blast.cgi
https://bioschemas.org/devSpecs/Tool/
Evolving Best Practices
● At the moment we largely create markup by hand with validation through Google’s testing tool
○ More validators and tools on the way, see bioschemas.org/tools
● Make pages with markup reachable from your sitemap.xml
○ This will make it easier for some applications to find it.
● Avoid adding Bioschemas markup to the page dynamically (e.g. through JavaScript)
○ Applications trying to find your data may not have the resources to render pages.
● Specify an @id
● Evolving guidance at https://github.com/BioSchemas/specifications/wiki/Technical
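For the sitemap recommendation, a minimal sketch of generating sitemap.xml for the pages that carry markup is given below. The page URLs are placeholders and `build_sitemap` is a hypothetical helper; many sites will already have a sitemap generator and only need to make sure these pages are included.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls):
    """Return sitemap.xml content listing every page that carries markup."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in page_urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URLs for pages with embedded Bioschemas markup.
    pages = ["https://example.org/datasets/1", "https://example.org/datasets/2"]
    print(build_sitemap(pages))
```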
Questions?
● bioschemas.org
● bioschemas.org/groups/Technical
● https://bioschemas.org/software/
● Google Structured Data Testing Tool
● kcm1@hw.ac.uk
