Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France
Bioschemas: Marking up biodiversity websites to improve data discovery and web-scale integration
TDWG Webinar, 2021-03-10
Franck MICHEL*, Bioschemas Community http://bioschemas.org/people/
* Wimmics: AI in bridging social semantics and formal semantics on the Web
Semantic markup for web pages
schema.org: semantic markup for resources on the internet
Collaborative community project founded in 2011 by Google, Microsoft, Yahoo and Yandex
Define a common vocabulary to mark up resources on the internet
- Structured data makes resources understandable to search engines
- Improve ranking, discoverability
- Provide informative summarizations
Markup formats: Microdata, RDFa, Microformats
Source: https://w3techs.com/technologies/history_overview/structured_data/all
schema.org: semantic markup for resources on the internet
What we are talking about: types (778)
What we can say about those things: properties (1369)
Example type: http://schema.org/Person
How to share your biodiversity data?
Options range from simple to sophisticated: webpages and flat files, Web API, Linked Data KG
Integrative approach: GBIF, EoL, iDigBio…
Bioschemas: schema.org extension for life sciences
Community initiative built on top of Schema.org
Aim
- Help search engines understand and index webpages
- Improve resource discoverability and interoperability
Approach
- Reuse/extend Schema.org for life sciences
- Keep it simple (no complex domain ontology)
- Provide guidelines on how to mark up resources
  • Minimum/recommended/optional properties
  • Link to other vocabularies & domain ontologies
- Flexibility: recommendations, not constraints
Diagram labels: support software, specification, data model, minimum information, controlled vocabularies, cardinality, documentation, examples, new (properties | types)
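The minimum/recommended/optional levels can be pictured as a simple check on a piece of markup. The sketch below is only an illustration: the property lists in PROFILE are hypothetical placeholders, not the content of any actual Bioschemas profile, and check_markup is a made-up helper name.

```python
# Hypothetical property levels, for illustration only; the real lists are
# published on each Bioschemas profile page.
PROFILE = {
    "minimum": ["name", "taxonRank"],
    "recommended": ["parentTaxon", "sameAs"],
    "optional": ["image", "alternateName"],
}


def check_markup(markup, profile=PROFILE):
    """Return, for each level, the properties that are absent from the markup."""
    return {
        level: [prop for prop in props if prop not in markup]
        for level, props in profile.items()
    }


taxon = {
    "@context": "http://schema.org",
    "@type": "Taxon",
    "name": "Delphinapterus leucas (Pallas, 1776)",
    "taxonRank": "species",
}
missing = check_markup(taxon)
print(missing)
```

Since the profiles are recommendations rather than constraints, a check like this would report gaps instead of rejecting the markup.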
Bioschemas: schema.org extension for life sciences
Currently defined terms
‒ ChemicalSubstance
‒ DataCatalog
‒ Dataset
‒ Gene
‒ MolecularEntity
‒ Protein
‒ Sample
‒ Taxon
More terms to come
‒ BioSample
‒ ComputationalTool
‒ ComputationalWorkflow
‒ LabProtocol
‒ Phenotype
‒ ProteinStructure
‒ RNA
‒ TaxonName
‒ …
Taxon
Type: https://bioschemas.org/types/Taxon (meant to become http://schema.org/Taxon)
Profile: https://bioschemas.org/profiles/Taxon (provides usage recommendations)
dwc:vernacularName
Example markup of a page about taxon Delphinapterus leucas
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "Taxon",
  "name": "Delphinapterus leucas (Pallas, 1776)",
  "taxonRank": "species"
}
</script>
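For a site with many taxon pages, markup like this would normally be generated rather than written by hand. A minimal sketch, assuming the page template can call a helper (the function name taxon_script_tag is hypothetical):

```python
import json


def taxon_script_tag(name, rank):
    """Serialize a minimal Taxon description as an embeddable JSON-LD script tag."""
    markup = {
        "@context": "http://schema.org",
        "@type": "Taxon",
        "name": name,
        "taxonRank": rank,
    }
    body = json.dumps(markup, indent=2, ensure_ascii=False)
    # "</" inside a script element could end it early, so escape the slash;
    # "\/" is a valid JSON escape for "/".
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


tag = taxon_script_tag("Delphinapterus leucas (Pallas, 1776)", "species")
print(tag)
```

The returned string can be placed anywhere in the page's head or body.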
Example markup of a page about taxon Delphinapterus leucas
<script type="application/ld+json">
{
  "@context": [
    "http://schema.org",
    { "dwc": "http://rs.tdwg.org/dwc/terms/",
      "dwc:vernacularName": { "@container": "@language" }
    }
  ],
  "@type": "Taxon",
  "additionalType": "dwc:Taxon",
  "taxonRank": [ "species", "http://www.wikidata.org/entity/Q7432" ],
  "name": "Delphinapterus leucas (Pallas, 1776)",
  "alternateName": [ "Balaena albicans Muller, 1776", "Beluga catodon Gray, 1846" ],
  "dwc:vernacularName": [
    { "@language": "en", "@value": "Beluga Whale" },
    { "@language": "fr", "@value": "Bélouga" }
  ],
  "parentTaxon": {
    "@type": "Taxon",
    "name": "Delphinapterus Lacépède, 1804",
    "mainEntityOfPage": "https://inpn.mnhn.fr/espece/cd_nom/191588?lg=en",
    "taxonRank": "genus"
  },
  "image": "https://inpn.mnhn.fr/photos/uploads/webtofs/inpn/3/181473.jpg"
}
</script>
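On the consumer side, the language-tagged vernacular names can be grouped by language. The sketch below reads the compact JSON directly and does no JSON-LD context processing, which is a simplifying assumption; a real consumer would more likely run a JSON-LD processor first. The helper name vernacular_names is hypothetical.

```python
import json


def vernacular_names(markup, key="dwc:vernacularName"):
    """Map each language tag to the list of vernacular names found under `key`."""
    values = markup.get(key, [])
    if isinstance(values, dict):
        values = [values]
    names = {}
    for value in values:
        if isinstance(value, dict) and "@value" in value:
            # "und" is the BCP 47 tag for an undetermined language
            lang = value.get("@language", "und")
            names.setdefault(lang, []).append(value["@value"])
    return names


doc = json.loads("""
{
  "@type": "Taxon",
  "name": "Delphinapterus leucas (Pallas, 1776)",
  "dwc:vernacularName": [
    { "@language": "en", "@value": "Beluga Whale" },
    { "@language": "fr", "@value": "Bélouga" }
  ]
}
""")
names = vernacular_names(doc)
print(names)
```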
Example markup of a page about taxon Delphinapterus leucas
<script type="application/ld+json">
{
  ...
  "sameAs": [
    "http://doris.ffessm.fr/Especes/Delphinapterus-leucas-Beluga-868",
    "http://www.marinespecies.org/aphia.php?p=taxdetails&id=137115",
    "http://www.iucnredlist.org/details/6335"
  ],
  "identifier": [
    "60932",
    { "@type": "PropertyValue",
      "name": "WoRMS id",
      "propertyID": "http://www.wikidata.org/entity/P850",
      "value": "137115"
    }
  ]
}
</script>
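The identifier array mixes plain strings with PropertyValue objects, so a consumer has to handle both shapes. A minimal sketch of one way to normalize them (the helper name collect_identifiers is hypothetical):

```python
def collect_identifiers(markup):
    """Return (scheme, value) pairs; scheme is None for plain string identifiers."""
    raw = markup.get("identifier", [])
    if not isinstance(raw, list):
        raw = [raw]
    pairs = []
    for item in raw:
        if isinstance(item, str):
            pairs.append((None, item))
        elif isinstance(item, dict) and item.get("@type") == "PropertyValue":
            pairs.append((item.get("propertyID"), item.get("value")))
    return pairs


taxon = {
    "identifier": [
        "60932",
        {
            "@type": "PropertyValue",
            "name": "WoRMS id",
            "propertyID": "http://www.wikidata.org/entity/P850",
            "value": "137115",
        },
    ]
}
ids = collect_identifiers(taxon)
print(ids)
```

Keeping the propertyID next to the value is what lets two sites that both cite the same WoRMS id be matched to each other.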
What about name registries such as IPNI, ZooBank, MycoBank?
Photo: https://commons.wikimedia.org/wiki/File:Name_label.JPG
TaxonName
Type: https://bioschemas.org/types/TaxonName (meant to become http://schema.org/TaxonName)
Profile: https://bioschemas.org/profiles/TaxonName (provides usage recommendations)
Taxon vs. TaxonName, discussion: https://github.com/BioSchemas/specifications/issues/309
Example markup of a page about name Delphinapterus leucas
Taxon with TaxonName:
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "Taxon",
  "taxonRank": "species",
  "scientificName": {
    "@type": "TaxonName",
    "name": "Delphinapterus leucas",
    "author": "(Pallas, 1776)",
    "taxonRank": "species"
  },
  "alternateScientificName": [
    { "@type": "TaxonName",
      "name": "Balaena albicans",
      "author": "Muller, 1776",
      "taxonRank": "species"
    },
    { "@type": "TaxonName",
      "name": "Beluga catodon",
      "author": "Gray, 1846",
      "taxonRank": "species"
    }
  ]
}
</script>
TaxonName alone:
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "TaxonName",
  "name": "Delphinapterus leucas",
  "author": "(Pallas, 1776)",
  "taxonRank": "species"
}
</script>
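The Taxon-with-TaxonName structure can be assembled from a database record holding an accepted name and its synonyms. This is a sketch under two assumptions: names and authorships are already stored separately, and all names share the taxon's rank, as in the example. The helper names taxon_name and taxon_with_names are hypothetical.

```python
import json


def taxon_name(name, author, rank):
    """Build one TaxonName node."""
    return {"@type": "TaxonName", "name": name, "author": author, "taxonRank": rank}


def taxon_with_names(accepted, synonyms, rank):
    """Build a Taxon node; `accepted` and each synonym are (name, author) tuples."""
    return {
        "@context": "http://schema.org",
        "@type": "Taxon",
        "taxonRank": rank,
        "scientificName": taxon_name(accepted[0], accepted[1], rank),
        "alternateScientificName": [taxon_name(n, a, rank) for n, a in synonyms],
    }


beluga = taxon_with_names(
    ("Delphinapterus leucas", "(Pallas, 1776)"),
    [("Balaena albicans", "Muller, 1776"), ("Beluga catodon", "Gray, 1846")],
    "species",
)
print(json.dumps(beluga, indent=2, ensure_ascii=False))
```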
Live deployments
Photo: https://www.flickr.com/photos/35034363287@N01/2284904309
Early deployment at NMNH Paris
https://search.google.com/structured-data/testing-tool
180,000+ pages marked up with Taxon & TaxonName types
https://inpn.mnhn.fr/espece/cd_nom/60932
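What a crawler or testing tool does with such a page can be approximated with the Python standard library alone. The sketch below pulls JSON-LD blocks out of an HTML string; the sample page is made up for illustration and is not the actual content of the INPN page, and JsonLdExtractor is a hypothetical class name.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed content of every application/ld+json script element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.documents.append(json.loads("".join(self._chunks)))


# Made-up sample page, for illustration only
page = """<html><head>
<script type="application/ld+json">
{ "@context": "http://schema.org", "@type": "Taxon",
  "name": "Delphinapterus leucas (Pallas, 1776)", "taxonRank": "species" }
</script>
</head><body><h1>Delphinapterus leucas</h1></body></html>"""

extractor = JsonLdExtractor()
extractor.feed(page)
extractor.close()
taxa = [d for d in extractor.documents
        if isinstance(d, dict) and d.get("@type") == "Taxon"]
print(taxa)
```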
Early deployments https://bioschemas.org/liveDeploys/
- NMNH Paris: Taxon & TaxonName, 180K pages
- GBIF: Taxon & TaxonName, 3M pages
- Scholia (https://scholia.toolforge.org): Taxon, 2.7M pages. Scientific bibliographic information based on Wikidata
- PIPPA (PSB Int. for Plant Phenotype Analysis): Taxon ↔ BioChemEntity
- OpaleSurfCasting.net: Taxon. French leisure sea fishing legislation.
Why do early deployments matter?
• A way for the community to show its interest in having these terms
• Necessary for Schema.org to endorse new types
• First step to foster novel applications (chicken & egg)
Next steps
Bioschemas work on biodiversity https://bioschemas.org/groups/Biodiversity/
Currently:
- Taxon, TaxonName: links to DwC terms
Future:
- Specimen: links to ABCD, openDS, MIDS?
- Traits: links to traits ontologies?
- Occurrence: links to DwC occurrences?
- …
Marking up biodiversity resources… at scale
GBIF, EoL, CoL, iDigBio, DiSSCo…
Museum collections,
Literature (BHL, Plazi…),
Citizen science platforms,
Independent institutions,
Associations,
…
Take-aways
Marking up webpages
• Increases data visibility and discoverability
• Relatively inexpensive
• Connect unconnected pieces of data, e.g. “grey literature”
Let’s have search engines do the job for us!
• Connect pieces of data at web scale
• First step for data integration is discovery
• Dataset search engines
• What about a Species Search Engine?
• …
Not the magic bullet
• Name discrepancies
• Compliance with nomenclature
• How to name taxonomic ranks
• …
https://bioschemas.org/
https://github.com/BioSchemas/specifications/wiki
Questions?

More Related Content

What's hot

Motivation, inspiration and innovation from frustration
Motivation, inspiration and innovation from frustrationMotivation, inspiration and innovation from frustration
Motivation, inspiration and innovation from frustration
Herbert Van de Sompel
 
towards interoperable archives: the Universal Preprint Service initiative
towards interoperable archives:  the Universal Preprint Service initiativetowards interoperable archives:  the Universal Preprint Service initiative
towards interoperable archives: the Universal Preprint Service initiative
Herbert Van de Sompel
 
Augmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositoriesAugmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositories
Herbert Van de Sompel
 
Open Archives Initiative Object Re-Use & Exchange
Open Archives Initiative Object Re-Use & ExchangeOpen Archives Initiative Object Re-Use & Exchange
Open Archives Initiative Object Re-Use & Exchange
Herbert Van de Sompel
 
#LAWDI Open Context, publishing linked data in archaeology
#LAWDI Open Context, publishing linked data in archaeology#LAWDI Open Context, publishing linked data in archaeology
#LAWDI Open Context, publishing linked data in archaeology
ekansa
 

What's hot (20)

Open Research Data: Licensing | Standards | Future
Open Research Data: Licensing | Standards | FutureOpen Research Data: Licensing | Standards | Future
Open Research Data: Licensing | Standards | Future
 
Museum impact: linking-up specimens with research published on them
Museum impact: linking-up specimens with research published on themMuseum impact: linking-up specimens with research published on them
Museum impact: linking-up specimens with research published on them
 
Motivation, inspiration and innovation from frustration
Motivation, inspiration and innovation from frustrationMotivation, inspiration and innovation from frustration
Motivation, inspiration and innovation from frustration
 
towards interoperable archives: the Universal Preprint Service initiative
towards interoperable archives:  the Universal Preprint Service initiativetowards interoperable archives:  the Universal Preprint Service initiative
towards interoperable archives: the Universal Preprint Service initiative
 
Augmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositoriesAugmenting interoperability across scholarly repositories
Augmenting interoperability across scholarly repositories
 
Open Archives Initiative Object Re-Use & Exchange
Open Archives Initiative Object Re-Use & ExchangeOpen Archives Initiative Object Re-Use & Exchange
Open Archives Initiative Object Re-Use & Exchange
 
ContentMine + EPMC: Finding Zika!
ContentMine + EPMC: Finding Zika!ContentMine + EPMC: Finding Zika!
ContentMine + EPMC: Finding Zika!
 
Text and Data Mining explained at FTDM
Text and Data Mining explained at FTDMText and Data Mining explained at FTDM
Text and Data Mining explained at FTDM
 
ContentMine (TDM) at JISC Digifest
ContentMine (TDM) at JISC DigifestContentMine (TDM) at JISC Digifest
ContentMine (TDM) at JISC Digifest
 
ContentMining for France and Europe; Lessons from 2 years in UK
ContentMining for France and Europe; Lessons from 2 years in UKContentMining for France and Europe; Lessons from 2 years in UK
ContentMining for France and Europe; Lessons from 2 years in UK
 
ContentMine: Liberating scholarship from Open publications and theses
ContentMine: Liberating scholarship from Open publications and thesesContentMine: Liberating scholarship from Open publications and theses
ContentMine: Liberating scholarship from Open publications and theses
 
The Content Mine (presented at UKSG)
The Content Mine (presented at UKSG)The Content Mine (presented at UKSG)
The Content Mine (presented at UKSG)
 
The biodiversity informatics landscape: a systematics perspective
The biodiversity informatics landscape: a systematics perspectiveThe biodiversity informatics landscape: a systematics perspective
The biodiversity informatics landscape: a systematics perspective
 
Workshop 5: Uptake of, and concepts in text and data mining
Workshop 5: Uptake of, and concepts in text and data miningWorkshop 5: Uptake of, and concepts in text and data mining
Workshop 5: Uptake of, and concepts in text and data mining
 
The culture of researchData
The culture of researchData The culture of researchData
The culture of researchData
 
Open scholarship [a FOSTER open science talk]
Open scholarship [a FOSTER open science talk]Open scholarship [a FOSTER open science talk]
Open scholarship [a FOSTER open science talk]
 
Content Mining for Machines and Humans
Content Mining for Machines and HumansContent Mining for Machines and Humans
Content Mining for Machines and Humans
 
#LAWDI Open Context, publishing linked data in archaeology
#LAWDI Open Context, publishing linked data in archaeology#LAWDI Open Context, publishing linked data in archaeology
#LAWDI Open Context, publishing linked data in archaeology
 
20140317 pi b_nmbe_journal_club
20140317 pi b_nmbe_journal_club20140317 pi b_nmbe_journal_club
20140317 pi b_nmbe_journal_club
 
Cochrane workshop2016
Cochrane workshop2016Cochrane workshop2016
Cochrane workshop2016
 

Similar to Bioschemas: Marking up biodiversity websites to improve data discovery and web-scale integration

Yde de Jong & Dave Roberts - ZooBank and EDIT: Towards a business model for Z...
Yde de Jong & Dave Roberts - ZooBank and EDIT: Towards a business model for Z...Yde de Jong & Dave Roberts - ZooBank and EDIT: Towards a business model for Z...
Yde de Jong & Dave Roberts - ZooBank and EDIT: Towards a business model for Z...
ICZN
 
Linq 2013 plenary_keynote_sicilia
Linq 2013 plenary_keynote_siciliaLinq 2013 plenary_keynote_sicilia
Linq 2013 plenary_keynote_sicilia
LINQ_Conference
 

Similar to Bioschemas: Marking up biodiversity websites to improve data discovery and web-scale integration (20)

Unleash the Potential of your Website! 180,000 webpages from the French NHM m...
Unleash the Potential of your Website! 180,000 webpages from the French NHM m...Unleash the Potential of your Website! 180,000 webpages from the French NHM m...
Unleash the Potential of your Website! 180,000 webpages from the French NHM m...
 
ISSA: Generic Pipeline, Knowledge Model and Visualization tools to Help Scien...
ISSA: Generic Pipeline, Knowledge Model and Visualization tools to Help Scien...ISSA: Generic Pipeline, Knowledge Model and Visualization tools to Help Scien...
ISSA: Generic Pipeline, Knowledge Model and Visualization tools to Help Scien...
 
Heterogeneous Data Aggregation and Querying at Web Scale Using Semantic align...
Heterogeneous Data Aggregation and Querying at Web Scale Using Semantic align...Heterogeneous Data Aggregation and Querying at Web Scale Using Semantic align...
Heterogeneous Data Aggregation and Querying at Web Scale Using Semantic align...
 
2 Discovery and Acquisition of Data1.pptx
2 Discovery and Acquisition of Data1.pptx2 Discovery and Acquisition of Data1.pptx
2 Discovery and Acquisition of Data1.pptx
 
Describe and Publish data sets on the web: vocabularies, catalogues, data por...
Describe and Publish data sets on the web: vocabularies, catalogues, data por...Describe and Publish data sets on the web: vocabularies, catalogues, data por...
Describe and Publish data sets on the web: vocabularies, catalogues, data por...
 
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
 
Ciard Initiative and a Global Infrastructure for Linked Open Data
Ciard Initiative and a Global Infrastructure for Linked Open Data Ciard Initiative and a Global Infrastructure for Linked Open Data
Ciard Initiative and a Global Infrastructure for Linked Open Data
 
Cornell 2011 05-13
Cornell 2011 05-13Cornell 2011 05-13
Cornell 2011 05-13
 
Metadata for Interoperable Bioscience
Metadata for Interoperable BioscienceMetadata for Interoperable Bioscience
Metadata for Interoperable Bioscience
 
The BlueBRIDGE approach to collaborative research
The BlueBRIDGE approach to collaborative researchThe BlueBRIDGE approach to collaborative research
The BlueBRIDGE approach to collaborative research
 
Scratchpads: Building web communities supporting biodiversity science
Scratchpads: Building web communities supporting biodiversity scienceScratchpads: Building web communities supporting biodiversity science
Scratchpads: Building web communities supporting biodiversity science
 
Yde de Jong & Dave Roberts - ZooBank and EDIT: Towards a business model for Z...
Yde de Jong & Dave Roberts - ZooBank and EDIT: Towards a business model for Z...Yde de Jong & Dave Roberts - ZooBank and EDIT: Towards a business model for Z...
Yde de Jong & Dave Roberts - ZooBank and EDIT: Towards a business model for Z...
 
ContentMining for Synthetic Biology
ContentMining for Synthetic BiologyContentMining for Synthetic Biology
ContentMining for Synthetic Biology
 
ContentMining for Synthetic Biology
ContentMining for Synthetic BiologyContentMining for Synthetic Biology
ContentMining for Synthetic Biology
 
NFDI Physical Sciences Colloquium - FAIR
NFDI Physical Sciences Colloquium - FAIRNFDI Physical Sciences Colloquium - FAIR
NFDI Physical Sciences Colloquium - FAIR
 
Linq 2013 plenary_keynote_sicilia
Linq 2013 plenary_keynote_siciliaLinq 2013 plenary_keynote_sicilia
Linq 2013 plenary_keynote_sicilia
 
Ontology repositories and case study with OntoPortal
Ontology repositories and case study with OntoPortalOntology repositories and case study with OntoPortal
Ontology repositories and case study with OntoPortal
 
New trends in ontological engineering, practices and tools
New trends in ontological engineering, practices and toolsNew trends in ontological engineering, practices and tools
New trends in ontological engineering, practices and tools
 
IDB-Cloud Providing Bioinformatics Services on Cloud
IDB-Cloud Providing Bioinformatics Services on CloudIDB-Cloud Providing Bioinformatics Services on Cloud
IDB-Cloud Providing Bioinformatics Services on Cloud
 
World bank 2011-05
World bank 2011-05World bank 2011-05
World bank 2011-05
 

More from Franck Michel

SPARQL Micro-Services: Lightweight Integration of Web APIs and Linked Data
SPARQL Micro-Services: Lightweight Integration of Web APIs and Linked DataSPARQL Micro-Services: Lightweight Integration of Web APIs and Linked Data
SPARQL Micro-Services: Lightweight Integration of Web APIs and Linked Data
Franck Michel
 
Construction d’un référentiel taxonomique commun pour des études sur l’histoi...
Construction d’un référentiel taxonomique commun pour des études sur l’histoi...Construction d’un référentiel taxonomique commun pour des études sur l’histoi...
Construction d’un référentiel taxonomique commun pour des études sur l’histoi...
Franck Michel
 

More from Franck Michel (11)

Knowledge Engineering: Semantic web, web of data, linked data
Knowledge Engineering: Semantic web, web of data, linked dataKnowledge Engineering: Semantic web, web of data, linked data
Knowledge Engineering: Semantic web, web of data, linked data
 
Enabling Automatic Discovery and Querying of Web APIs at Web Scale using Link...
Enabling Automatic Discovery and Querying of Web APIs at Web Scale using Link...Enabling Automatic Discovery and Querying of Web APIs at Web Scale using Link...
Enabling Automatic Discovery and Querying of Web APIs at Web Scale using Link...
 
Modelling Biodiversity Linked Data: Pragmatism May Narrow Future Opportunities
Modelling Biodiversity Linked Data: Pragmatism May Narrow Future OpportunitiesModelling Biodiversity Linked Data: Pragmatism May Narrow Future Opportunities
Modelling Biodiversity Linked Data: Pragmatism May Narrow Future Opportunities
 
SPARQL Micro-Services: Lightweight Integration of Web APIs and Linked Data
SPARQL Micro-Services: Lightweight Integration of Web APIs and Linked DataSPARQL Micro-Services: Lightweight Integration of Web APIs and Linked Data
SPARQL Micro-Services: Lightweight Integration of Web APIs and Linked Data
 
Integrating Heterogeneous Data Sources in the Web of Data
Integrating Heterogeneous Data Sources in the Web of DataIntegrating Heterogeneous Data Sources in the Web of Data
Integrating Heterogeneous Data Sources in the Web of Data
 
Construction d’un référentiel taxonomique commun pour des études sur l’histoi...
Construction d’un référentiel taxonomique commun pour des études sur l’histoi...Construction d’un référentiel taxonomique commun pour des études sur l’histoi...
Construction d’un référentiel taxonomique commun pour des études sur l’histoi...
 
A Mapping-based Method to Query MongoDB Documents with SPARQL
A Mapping-based Method to Query MongoDB Documents with SPARQLA Mapping-based Method to Query MongoDB Documents with SPARQL
A Mapping-based Method to Query MongoDB Documents with SPARQL
 
A Generic Mapping-based Query Translation from SPARQL to Various Target Datab...
A Generic Mapping-based Query Translation from SPARQL to Various Target Datab...A Generic Mapping-based Query Translation from SPARQL to Various Target Datab...
A Generic Mapping-based Query Translation from SPARQL to Various Target Datab...
 
Make our Scientific Datasets Accessible and Interoperable on the Web
Make our Scientific Datasets Accessible and Interoperable on the WebMake our Scientific Datasets Accessible and Interoperable on the Web
Make our Scientific Datasets Accessible and Interoperable on the Web
 
Translation of Relational and Non-Relational Databases into RDF with xR2RML
Translation of Relational and Non-Relational Databases into RDF with xR2RMLTranslation of Relational and Non-Relational Databases into RDF with xR2RML
Translation of Relational and Non-Relational Databases into RDF with xR2RML
 
Towards a Shared Reference Thesaurus for Studies on History of Zoology, Archa...
Towards a Shared Reference Thesaurus for Studies on History of Zoology, Archa...Towards a Shared Reference Thesaurus for Studies on History of Zoology, Archa...
Towards a Shared Reference Thesaurus for Studies on History of Zoology, Archa...
 

Recently uploaded

Production 2024 sunderland culture final - Copy.pptx
Production 2024 sunderland culture final - Copy.pptxProduction 2024 sunderland culture final - Copy.pptx
Production 2024 sunderland culture final - Copy.pptx
ChloeMeadows1
 
audience research (emma) 1.pptxkkkkkkkkkkkkkkkkk
audience research (emma) 1.pptxkkkkkkkkkkkkkkkkkaudience research (emma) 1.pptxkkkkkkkkkkkkkkkkk
audience research (emma) 1.pptxkkkkkkkkkkkkkkkkk
lolsDocherty
 

Recently uploaded (17)

Statistical Analysis of DNS Latencies.pdf
Statistical Analysis of DNS Latencies.pdfStatistical Analysis of DNS Latencies.pdf
Statistical Analysis of DNS Latencies.pdf
 
I’ll See Y’All Motherfuckers In Game 7 Shirt
I’ll See Y’All Motherfuckers In Game 7 ShirtI’ll See Y’All Motherfuckers In Game 7 Shirt
I’ll See Y’All Motherfuckers In Game 7 Shirt
 
Production 2024 sunderland culture final - Copy.pptx
Production 2024 sunderland culture final - Copy.pptxProduction 2024 sunderland culture final - Copy.pptx
Production 2024 sunderland culture final - Copy.pptx
 
Registry Data Accuracy Improvements, presented by Chimi Dorji at SANOG 41 / I...
Registry Data Accuracy Improvements, presented by Chimi Dorji at SANOG 41 / I...Registry Data Accuracy Improvements, presented by Chimi Dorji at SANOG 41 / I...
Registry Data Accuracy Improvements, presented by Chimi Dorji at SANOG 41 / I...
 
Premier Mobile App Development Agency in USA.pdf
Premier Mobile App Development Agency in USA.pdfPremier Mobile App Development Agency in USA.pdf
Premier Mobile App Development Agency in USA.pdf
 
Development Lifecycle.pptx for the secure development of apps
Development Lifecycle.pptx for the secure development of appsDevelopment Lifecycle.pptx for the secure development of apps
Development Lifecycle.pptx for the secure development of apps
 
iThome_CYBERSEC2024_Drive_Into_the_DarkWeb
iThome_CYBERSEC2024_Drive_Into_the_DarkWebiThome_CYBERSEC2024_Drive_Into_the_DarkWeb
iThome_CYBERSEC2024_Drive_Into_the_DarkWeb
 
Cyber Security Services Unveiled: Strategies to Secure Your Digital Presence
Cyber Security Services Unveiled: Strategies to Secure Your Digital PresenceCyber Security Services Unveiled: Strategies to Secure Your Digital Presence
Cyber Security Services Unveiled: Strategies to Secure Your Digital Presence
 
GOOGLE Io 2024 At takes center stage.pdf
GOOGLE Io 2024 At takes center stage.pdfGOOGLE Io 2024 At takes center stage.pdf
GOOGLE Io 2024 At takes center stage.pdf
 
TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.
TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.
TORTOGEL TELAH MENJADI SALAH SATU PLATFORM PERMAINAN PALING FAVORIT.
 
AI Generated 3D Models | AI 3D Model Generator
AI Generated 3D Models | AI 3D Model GeneratorAI Generated 3D Models | AI 3D Model Generator
AI Generated 3D Models | AI 3D Model Generator
 
The Rise of Subscription-Based Digital Services.pdf
The Rise of Subscription-Based Digital Services.pdfThe Rise of Subscription-Based Digital Services.pdf
The Rise of Subscription-Based Digital Services.pdf
 
Thank You Luv I’ll Never Walk Alone Again T shirts
Thank You Luv I’ll Never Walk Alone Again T shirtsThank You Luv I’ll Never Walk Alone Again T shirts
Thank You Luv I’ll Never Walk Alone Again T shirts
 
Reggie miller choke t shirtsReggie miller choke t shirts
Reggie miller choke t shirtsReggie miller choke t shirtsReggie miller choke t shirtsReggie miller choke t shirts
Reggie miller choke t shirtsReggie miller choke t shirts
 
audience research (emma) 1.pptxkkkkkkkkkkkkkkkkk
audience research (emma) 1.pptxkkkkkkkkkkkkkkkkkaudience research (emma) 1.pptxkkkkkkkkkkkkkkkkk
audience research (emma) 1.pptxkkkkkkkkkkkkkkkkk
 
Free scottie t shirts Free scottie t shirts
Free scottie t shirts Free scottie t shirtsFree scottie t shirts Free scottie t shirts
Free scottie t shirts Free scottie t shirts
 
Bug Bounty Blueprint : A Beginner's Guide
Bug Bounty Blueprint : A Beginner's GuideBug Bounty Blueprint : A Beginner's Guide
Bug Bounty Blueprint : A Beginner's Guide
 

Bioschemas: Marking up biodiversity websites to improve data discovery and web-scale integration

  • 1. 1 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France Bioschemas: Marking up biodiversity websites to improve data discovery and web-scale integration * Wimmics: AI in bridging social semantics and formal semantics on the Web TDWG Webinar, 2021-03-10 Franck MICHEL* Bioschemas Community http://bioschemas.org/people/
  • 2. 2 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France Semantic markup for web pages
  • 3. 3 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France
  • 4. 4 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France Collaborative community project founded in 2011 by Define a common vocabulary to markup resources on the internet - Structured data makes resources understandable to search engines - Improve ranking, discoverability - Provide informative summarizations : semantic markup for resources on the internet schema.org Microdata RDFa Microformats Markup formats
  • 5. 5 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France Collaborative community project founded in 2011 by Define a common vocabulary to markup resources on the internet - Structured data makes resources understandable to search engines - Improve ranking, discoverability - Provide informative summarizations Microdata RDFa : semantic markup for resources on the internet Microformats schema.org Source: https://w3techs.com/technologies/history_overview/structured_data/all Markup formats
  • 6. 6 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France What we are talking about: types (778) What we can say about those things: properties (1369) : semantic markup for resources on the internet schema.org http://schema.org/Person
  • 7. 7 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France Webpages How to share your biodiversity data? Web API Linked Data KG Integrative approach GBIF, EoL, iDigBio… simple sophisticated Flat files
  • 8. 8 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France How to share your biodiversity data? Web API Linked Data KG Integrative approach GBIF, EoL, iDigBio… simple sophisticated Webpages Flat files
  • 9. 9 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France Bioschemas: schema.org extension for LifeSciences Community initiative built on top of Schema.org Aim Help search engines understand and index webpages Improve resources discoverability and interoperability Approach Reuse/extend Schema.org for life sciences Keep it simple (no complex domain ontology) Provide guidelines on how to markup resources • Minimum/recommended/optional properties • Link to other vocabularies & domain ontologies Flexibility: recommandations, not constraints Support software Specification Data model Minimum information Controlled vocabularies Cardinality Documentation Examples New (properties | types)
  • 10. 10 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France Currently defined terms ‒ ChemicalSubstance ‒ DataCatalog ‒ Dataset ‒ Gene ‒ MolecularEntity ‒ Protein ‒ Sample ‒ Taxon More terms to come ‒ BioSample ‒ ComputationalTool ‒ ComputationalWorkflow ‒ LabProtocol ‒ Phenotype ‒ ProteinStructure ‒ RNA ‒ TaxonName ‒ … Bioschemas: schema.org extension for LifeSciences
  • 11. 11 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France Taxon Type: https://bioschemas.org/types/Taxon meant to become http://schema.org/Taxon Profile: https://bioschemas.org/profiles/Taxon provides usage recommendations dwc:vernacularName
  • 12. 12 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France Example markup of a page about taxon Delphinapterus leucas <script type="application/ld+json"> { "@context": "http://schema.org", "@type" : "Taxon", "name": "Delphinapterus leucas (Pallas, 1776)", "taxonRank": "species" } </script>
  • 13. 13 Franck MICHEL - Université Côte d’Azur, CNRS, Inria, I3S, France Example markup of a page about taxon Delphinapterus leucas <script type="application/ld+json"> { "@context": [ "http://schema.org", { "dwc": "http://rs.tdwg.org/dwc/terms/", "dwc:vernacularName": { "@container": "@language" } } ], "@type" : "Taxon", "additionalType": "dwc:Taxon", "taxonRank": ["species", "http://www.wikidata.org/entity/Q7432" ], "name": "Delphinapterus leucas (Pallas, 1776)", "alternateName": [ "Balaena albicans Muller, 1776", "Beluga catodon Gray, 1846" ], "dwc:vernacularName": [ { "@language": "en", "@value": "Beluga Whale" }, { "@language": "fr", "@value": "Bélouga" } ], "parentTaxon": { "@type": "Taxon", "name": "Delphinapterus Lacépède, 1804", "mainEntityOfPage": "https://inpn.mnhn.fr/espece/cd_nom/191588?lg=en", "taxonRank" : "genus" }, "image": "https://inpn.mnhn.fr/photos/uploads/webtofs/inpn/3/181473.jpg" } </script>
  • 14. Example markup of a page about taxon Delphinapterus leucas: <script type="application/ld+json"> { ... "sameAs": [ "http://doris.ffessm.fr/Especes/Delphinapterus-leucas-Beluga-868", "http://www.marinespecies.org/aphia.php?p=taxdetails&id=137115", "http://www.iucnredlist.org/details/6335" ], "identifier": [ "60932", { "@type": "PropertyValue", "name": "WoRMS id", "propertyID": "http://www.wikidata.org/entity/P850", "value": "137115" } ] } </script>
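A consumer of such markup has to cope with the `identifier` property mixing plain strings and structured `PropertyValue` objects. The following sketch shows one way to separate them; `split_identifiers` is a hypothetical helper, and keying the structured values by `propertyID` is an assumption made for this illustration.

```python
def split_identifiers(markup: dict):
    """Split 'identifier' into plain values and a propertyID -> value mapping."""
    raw = markup.get("identifier", [])
    if not isinstance(raw, list):
        raw = [raw]  # a single identifier may appear without an enclosing array
    plain, typed = [], {}
    for item in raw:
        if isinstance(item, dict) and item.get("@type") == "PropertyValue":
            typed[item.get("propertyID")] = item.get("value")
        else:
            plain.append(item)
    return plain, typed


markup = {
    "identifier": [
        "60932",
        {
            "@type": "PropertyValue",
            "name": "WoRMS id",
            "propertyID": "http://www.wikidata.org/entity/P850",
            "value": "137115",
        },
    ]
}
plain, typed = split_identifiers(markup)
print(plain, typed)
```

Using the Wikidata property URI as the key lets identifiers from different pages be matched without relying on the free-text `name`.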
  • 15. What about name registries such as IPNI, Zoobank, Mycobank? Photo: https://commons.wikimedia.org/wiki/File:Name_label.JPG
  • 16. TaxonName. Taxon vs. TaxonName, discussion: https://github.com/BioSchemas/specifications/issues/309 Type: https://bioschemas.org/types/TaxonName (meant to become http://schema.org/TaxonName). Profile: https://bioschemas.org/profiles/TaxonName (provides usage recommendations).
  • 17. Example markup of a page about name Delphinapterus leucas. Taxon with TaxonName: <script type="application/ld+json"> { "@context": "http://schema.org", "@type" : "Taxon", "taxonRank": "species", "scientificName": { "@type" : "TaxonName", "name": "Delphinapterus leucas", "author": "(Pallas, 1776)", "taxonRank": "species" }, "alternateScientificName": [ { "@type" : "TaxonName", "name": "Balaena albicans", "author": "Muller, 1776", "taxonRank": "species" }, { "@type" : "TaxonName", "name": "Beluga catodon", "author": "Gray, 1846", "taxonRank": "species" } ] } </script> TaxonName alone: <script type="application/ld+json"> { "@context": "http://schema.org", "@type" : "TaxonName", "name": "Delphinapterus leucas", "author": "(Pallas, 1776)", "taxonRank": "species" } </script>
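The nested form, a Taxon carrying TaxonName objects, can be assembled from separate name, author and rank fields. This is a minimal sketch; `taxon_name` and `taxon_with_names` are hypothetical helper names, and the input values are those shown on the slide.

```python
import json


def taxon_name(name: str, author: str, rank: str) -> dict:
    """Build a TaxonName node from its separate parts."""
    return {"@type": "TaxonName", "name": name, "author": author, "taxonRank": rank}


def taxon_with_names(accepted: dict, synonyms: list, rank: str) -> dict:
    """Build a Taxon whose accepted name and synonyms are TaxonName nodes."""
    return {
        "@context": "http://schema.org",
        "@type": "Taxon",
        "taxonRank": rank,
        "scientificName": accepted,
        "alternateScientificName": synonyms,
    }


beluga = taxon_with_names(
    accepted=taxon_name("Delphinapterus leucas", "(Pallas, 1776)", "species"),
    synonyms=[
        taxon_name("Balaena albicans", "Muller, 1776", "species"),
        taxon_name("Beluga catodon", "Gray, 1846", "species"),
    ],
    rank="species",
)
print(json.dumps(beluga, indent=2, ensure_ascii=False))
```

Keeping the author apart from the name avoids having to parse strings such as "Delphinapterus leucas (Pallas, 1776)" on the consumer side.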
  • 18. Live deployments. Photo: https://www.flickr.com/photos/35034363287@N01/2284904309
  • 19. Early deployment at NMNH Paris: 180,000+ pages marked up with Taxon & TaxonName types. Example page: https://inpn.mnhn.fr/espece/cd_nom/60932 Testing tool: https://search.google.com/structured-data/testing-tool
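Besides an online testing tool, the markup embedded in a page can be inspected locally. The sketch below extracts JSON-LD blocks from an HTML string with Python's standard `html.parser`; the `JsonLdExtractor` class and the `SAMPLE_HTML` page are made up for illustration, and no live page is fetched.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed content of every application/ld+json script element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._chunks)))
            except json.JSONDecodeError:
                pass  # skip blocks that are not valid JSON


# Made-up page used only to exercise the extractor.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{ "@context": "http://schema.org", "@type": "Taxon",
  "name": "Delphinapterus leucas (Pallas, 1776)", "taxonRank": "species" }
</script>
<script>console.log("not JSON-LD");</script>
</head><body><h1>Delphinapterus leucas</h1></body></html>
"""

extractor = JsonLdExtractor()
extractor.feed(SAMPLE_HTML)
types = [block.get("@type") for block in extractor.blocks]
print(types)
```

A full JSON-LD processor would also expand the `@context`; this sketch only reads the raw JSON, which is enough for a quick check that a page carries the expected types.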
  • 20. Early deployments (https://bioschemas.org/liveDeploys/). NMNH Paris: Taxon & TaxonName, 180K pages. GBIF: Taxon & TaxonName, 3M pages. Scholia (https://scholia.toolforge.org), scientific bibliographic information based on Wikidata: Taxon, 2.7M pages. PIPPA (PSB Int. for Plant Phenotype Analysis): Taxon ↔ BioChemEntity. OpaleSurfCasting.net (French leisure sea fishing legislation): Taxon. Why do early deployments matter? • A way for the community to show its interest in having these terms • Necessary for Schema.org to endorse new types • First step to foster novel applications (chicken & egg)
  • 21. Next steps
  • 22. Bioschemas work on biodiversity (https://bioschemas.org/groups/Biodiversity/). Currently: Taxon, TaxonName; links to DwC terms. Future: Specimen (links to ABCD, openDS, MIDS?); Traits (links to traits ontologies?); Occurrence (links to DwC occurrences?); …
  • 23. Marking up biodiversity resources… at scale: GBIF, EoL, CoL, iDigBio, DiSSCo…, museum collections, literature (BHL, Plazi…), citizen science platforms, independent institutions, associations, …
  • 24. Marking up webpages: let’s have search engines do the job for us! • Connect pieces of data at web scale • First step for data integration is discovery • Dataset search engines • What about a Species Search Engine? • … Take-aways: • Increases data visibility and discoverability • Relatively inexpensive • Connects unconnected pieces of data, e.g. “grey literature” Not a magic bullet: • Name discrepancies • Compliance with nomenclature • How to name taxonomic ranks • …
  • 25. Questions? https://bioschemas.org/ https://github.com/BioSchemas/specifications/wiki