SlideShare a Scribd company logo
Bio2RDF Release 2: Improved coverage,
interoperability and provenance of Life
Science Linked Data
Linked Data for the Life Sciences
Alison Callahan1, Jose Cruz-Toledo1, Peter Ansell2, Michel Dumontier1
1Carleton University, 2University of Queensland
ESWC2013::Bio2RDF Release 21
ESWC2013::Bio2RDF Release 2
is an open source framework
on the emerging semantic web
to produce and provide biological linked data
that uses simple conventions
2
ESWC2013::Bio2RDF Release 2
reduces the time and effort
so that you can get to
involved in data integration
doing science
3
Main features of Bio2RDF Release 2
• Bio2RDF conversion scripts, mapping files and web
application are open source and freely available at
http://github.com/bio2rdf
• Bio2RDF enables (syntactic) data integration within and
across datasets by using one language (RDF), having a
common URI pattern and a common resource registry
• 19 Release 2 datasets have provenance and endpoints
feature pre-computed graph summaries for fast lookup
• Bio2RDF web application enables entity resolution, query
federation across an expandable distributed network of
SPARQL endpoints
ESWC2013::Bio2RDF Release 24
ESWC2013::Bio2RDF Release 2
At the heart of Linked Data for the Life Sciences
5
Bio2RDF data are identified using
simple http URI patterns
Bio2RDF data are identified by Internationalized Resource Identifiers
(IRIs) of the form:
• http://bio2rdf.org/namespace:identifier
for source data with an assigned identifier
• http://bio2rdf.org/namespace_resource:identifier
for source data without an assigned identifier
• http://bio2rdf.org/namespace_vocabulary:identifier
for dataset-specific types and relations
Where namespace comes from a curated registry of datasets, hence
enabling simple syntactic-based integration in and across datasets.
ESWC2013::Bio2RDF Release 26
Data are described through machine-
understandable statements
ESWC2013::Bio2RDF Release 2
drugbank:DB00650
drugbank_vocabulary:Drug
rdf:type
drugbank_resource:DB00440_DB00650
drugbank_vocabulary:Drug-Drug-Interaction
rdf:type
drugbank_vocabulary:ddi-interactor-in
rdfs:label
DDI between Trimethoprim and
Leucovorin [drugbank_resource:
DB00440_DB00650]
rdfs:label
Leucovorin [drugbank:DB00650]
7
The linked data network expands
with inter-dataset statements
ESWC2013::Bio2RDF Release 2
drugbank:DB00650
pharmgkb_vocabulary:Drug
rdf:type
rdfs:label
Leucovorin [drugbank:DB00650]
pharmgkb:PA450198
drugbank_vocabulary:Drug
pharmgkb_vocabulary:xref
leucovorin [pharmgkb:PA450198]
rdfs:label
DrugBank
PharmGKB
8
Linked Data: You can look it up
ESWC2013::Bio2RDF Release 29
You can get what links to it
ESWC2013::Bio2RDF Release 2
http://bio2rdf.org/linksns/drugbank/drugbank:DB00650
What links to DrugBank’s Leucovorin?
10
Every Bio2RDF dataset now contains
provenance metadata
ESWC2013::Bio2RDF Release 2
Features
- Entity-dataset link
- Creator
- Publisher
- Date created
- License & rights
- Source
- Availability
- SPARQL endpoint
- Data dump
Vocabularies
VoID
Dublin Core
W3C Provenance
Bio2RDF vocabulary
11
data item
Bio2RDF
dataset
Source
dataset
void:inDataset
prov:wasDerivedFrom
Dataset Namespace # of triples
Affymetrix affymetrix 44469611
Biomodels* biomodels 589753
Comparative Toxicogenomics Database ctd 141845167
DrugBank drugbank 1121468
NCBI Gene ncbigene 394026267
Gene Ontology Annotations goa 80028873
HUGO Gene Nomenclature Committee hgnc 836060
Homologene homologene 1281881
InterPro* interpro 999031
iProClass iproclass 211365460
iRefIndex irefindex 31042135
Medical Subject Headings mesh 4172230
NCBO BioPortal* bioportal 15384622
National Drug Code Directory* ndc 17814216
Online Mendelian Inheritance in Man omim 1848729
Pharmacogenomics Knowledge Base pharmgkb 37949275
SABIO-RK* sabiork 2618288
Saccharomyces Genome Database sgd 5551009
NCBI Taxonomy taxon 17814216
Total 19 1,010,758,291
Bio2RDF Release 2 – New and Updated Datasets
ESWC2013::Bio2RDF Release 212
Inter-dataset connectivity
ESWC2013::Bio2RDF Release 213
ESWC2013::Bio2RDF Release 214
The Wider Network of Bio2RDF Linked Data
ESWC2013::Bio2RDF Release 215
ESWC2013::Bio2RDF Release 216
ESWC2013::Bio2RDF Release 217
ESWC2013::Bio2RDF Release 218
Graph summaries in query formulation
ESWC2013::Bio2RDF Release 2
PREFIX drugbank_vocabulary: <http://bio2rdf.org/drugbank_vocabulary:>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?ddi ?d1name ?d2name
WHERE {
?ddi a drugbank_vocabulary:Drug-Drug-Interaction .
?d1 drugbank_vocabulary:ddi-interactor-in ?ddi .
?d1 rdfs:label ?d1name .
?d2 drugbank_vocabulary:ddi-interactor-in ?ddi .
?d2 rdfs:label ?d2name.
FILTER (?d1 != ?d2)
}
19
You can use the SPARQLed query
assistant with updated endpoints
ESWC2013::Bio2RDF Release 2
http://sindicetech.com/sindice-suite/sparqled/
graph: http://sindicetech.com/analytics20
Use virtuoso’s built in faceted
browser to construct increasingly
complex queries with little effort
ESWC2013::Bio2RDF Release 221
Federated Queries over independent
SPARQL endpoints
# get all biochemical reactions in biomodels that are kinds of "protein catabolic
process“, as defined by the gene ontology (in bioportal endpoint)
SPARQL Endpoint: http://bioportal.bio2rdf.org/sparql
SELECT ?go ?label count(distinct ?x)
WHERE {
?go rdfs:label ?label .
?go rdfs:subClassOf ?tgo OPTION (TRANSITIVE) .
?tgo rdfs:label ?tlabel .
FILTER regex(?tlabel, "^protein catabolic process")
service <http://biomodels.bio2rdf.org/sparql> {
?x <http://bio2rdf.org/biopax_vocabulary:identical-to> ?go .
?x a <http://www.biopax.org/release/biopax-level3.owl#BiochemicalReaction> .
} # end service
}
ESWC2013::Bio2RDF Release 222
Question: Find all proteins that interact with beta
amyloid
SELECT * WHERE {
?protein a bio2rdf:Protein .
?protein bio2rdf:interacts_with bio2rdf:beta-amyloid
.
}
Heterogeneous biological data on the
semantic web is difficult to query
UniProt Protein PDB Protein
iRefIndex Protein
?
Physical interaction?
Pathway interaction?
Genetic interaction?
ESWC2013::Bio2RDF Release 223
ontology as a
strategy to formally
represent and
integrate knowledge
ESWC2013::Bio2RDF Release 224
uniprot:P05067
uniprot:Protein
is a
sio:gene
is a is a
Semantic data integration, consistency checking and
query answering over Bio2RDF with the
Semanticscience Integrated Ontology (SIO)
dataset
ontology
Knowledge Base
ESWC2013::Bio2RDF Release 2
pharmgkb:PA30917
refseq:Protein
is a
is a
omim:189931
omim:Gene pharmgkb:Gene
Querying Bio2RDF Linked Open Data with a Global Schema. Alison Callahan, José Cruz-Toledo and
Michel Dumontier. Bio-ontologies 2012.
25
ESWC2013::Bio2RDF Release 226
SRIQ(D)
10700+ axioms
1300+ classes
201 object properties (inc. inverses)
1 datatype property
Bio2RDF and SIO powered SPARQL 1.1 federated query:
Find chemicals in CTD and proteins in SGD that participate
in the same GO process
SELECT ?chem, ?prot, ?proc
FROM <http://bio2rdf.org/ctd>
WHERE {
?chemical a sio:chemical-entity.
?chemical rdfs:label ?chem.
?chemical sio:is-participant-in ?process.
?process rdfs:label ?proc.
FILTER regex (?process, "http://bio2rdf.org/go:")
SERVICE <http://sgd.bio2rdf.org/sparql> {
?protein a sio:protein .
?protein sio:is-participant-in ?process.
?protein rdfs:label ?prot .
}
}
ESWC2013::Bio2RDF Release 227
Bio2RDF RDFization guidelines are
available at our wiki
https://github.com/bio2rdf/bio2rdf-scripts/wiki/RDFization-Guide-v1.1
ESWC2013::Bio2RDF Release 228
What the future holds
• Aiming for twice yearly release schedule. Next release will
include large datasets (>15B RefSeq, Genbank, PubMed, PDB)
• Working with identifiers.org to create a common registry with
2200 entries
• Spiking in identifiers.org and original data uris, where
available
• Consolidating provenance/metrics with OpenPHACTS
• Incorporate W3C Linking Open Drug Data (LODD) effort
– OMIM (released), SIDER (beta), CHEMBL (beta), LinkedCT
(beta), DailyMed (RDF available), TCM (RDF available),
• Extended dataset coverage by tapping into existing endpoints
(uniprot, bioportal, ebi-rdf?)
• Showcase with other third party tools
ESWC2013::Bio2RDF Release 229
Bio2RDF Release 2 – A summary
• Updated data conversion source code to use PHP API (all
available through GitHub)
– http://github.com/bio2rdf/bio2rdf-scripts
– https://github.com/bio2rdf/bio2rdf-scripts/wiki/RDFization-
Guide-v1.1
• Simple Bio2RDF IRI design patterns that facilitate syntactic
consistency and interoperability backed by simple registry
• Dataset provenance and metrics
• We welcome comments, suggestions and contributions
• Join our mailing list at bio2rdf@googlegroups.com
ESWC2013::Bio2RDF Release 230
Acknowledgements
Bio2RDF Release 2
Allison Callahan, Jose Cruz-Toledo, Peter Ansell
Bio2RDF
Francois Belleau, Marc-Alexandre Nolin
Alex De Leon, Steve Etlinger, Nichealla Keath
Jacques Corbeil, James Hogan Jean Morissette, Nicole
Tourigny, Philippe Rigault and Paul Roe
ESWC2013::Bio2RDF Release 231
dumontierlab.com
michel_dumontier@carleton.ca
ESWC2013::Bio2RDF Release 2
Website: http://dumontierlab.com
Presentations: http://slideshare.com/micheldumontier
32

More Related Content

Viewers also liked

Lee.Portfolio 09
Lee.Portfolio 09Lee.Portfolio 09
Lee.Portfolio 09
leesalcone
 
Dokumentacia sommeliers seminars1
Dokumentacia sommeliers seminars1Dokumentacia sommeliers seminars1
Dokumentacia sommeliers seminars1Nural Tataoglu
 
Les xarxes socials
Les xarxes socialsLes xarxes socials
Les xarxes socials
Jeroni Navarrete Barceló
 
powerpoint
powerpointpowerpoint
powerpoint
Uruguayo
 
Clases de noviembre 5 ños 1
Clases de noviembre 5 ños 1Clases de noviembre 5 ños 1
Clases de noviembre 5 ños 1
Silvana Esmilda Castillo Fernandez
 
Laporan observasi rpp dan laboratorium
Laporan observasi rpp dan laboratoriumLaporan observasi rpp dan laboratorium
Laporan observasi rpp dan laboratorium
Samantars17
 
Baby talk 2
Baby talk 2Baby talk 2
Baby talk 2
Cristina Arnaboldi
 
Exposicion de financiamientos
Exposicion de financiamientosExposicion de financiamientos
Exposicion de financiamientos
Alita Orpe
 
Great quotes of bruce lee
Great quotes of bruce leeGreat quotes of bruce lee
εργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασης
εργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασηςεργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασης
εργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασης
πεντάλ σχολικό
 
Understanding Android Handling of Touch Events
Understanding Android Handling of Touch EventsUnderstanding Android Handling of Touch Events
Understanding Android Handling of Touch Events
jensmohr
 
Mobile note mobile by MAdvertise et Bemobee
Mobile note mobile by MAdvertise et BemobeeMobile note mobile by MAdvertise et Bemobee
Mobile note mobile by MAdvertise et Bemobee
Franck Deville
 
Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...
Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...
Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...
Unmetric
 
(151105)영화속 와인 이야기 v2
(151105)영화속 와인 이야기   v2(151105)영화속 와인 이야기   v2
(151105)영화속 와인 이야기 v2
휘웅 정
 
Symfony day 2016
Symfony day 2016Symfony day 2016
Symfony day 2016
Samuele Lilli
 
Conductores electricos
Conductores electricosConductores electricos
Conductores electricos
Karen Criado Vargas
 
Short Intro to Android Fragments
Short Intro to Android FragmentsShort Intro to Android Fragments
Short Intro to Android FragmentsJussi Pohjolainen
 

Viewers also liked (18)

Lee.Portfolio 09
Lee.Portfolio 09Lee.Portfolio 09
Lee.Portfolio 09
 
Dokumentacia sommeliers seminars1
Dokumentacia sommeliers seminars1Dokumentacia sommeliers seminars1
Dokumentacia sommeliers seminars1
 
Les xarxes socials
Les xarxes socialsLes xarxes socials
Les xarxes socials
 
powerpoint
powerpointpowerpoint
powerpoint
 
resume
resumeresume
resume
 
Clases de noviembre 5 ños 1
Clases de noviembre 5 ños 1Clases de noviembre 5 ños 1
Clases de noviembre 5 ños 1
 
Laporan observasi rpp dan laboratorium
Laporan observasi rpp dan laboratoriumLaporan observasi rpp dan laboratorium
Laporan observasi rpp dan laboratorium
 
Baby talk 2
Baby talk 2Baby talk 2
Baby talk 2
 
Exposicion de financiamientos
Exposicion de financiamientosExposicion de financiamientos
Exposicion de financiamientos
 
Great quotes of bruce lee
Great quotes of bruce leeGreat quotes of bruce lee
Great quotes of bruce lee
 
εργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασης
εργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασηςεργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασης
εργασία: κριτήρια επιλογής, συνέπειες βιομηχανικής επανάστασης
 
Understanding Android Handling of Touch Events
Understanding Android Handling of Touch EventsUnderstanding Android Handling of Touch Events
Understanding Android Handling of Touch Events
 
Mobile note mobile by MAdvertise et Bemobee
Mobile note mobile by MAdvertise et BemobeeMobile note mobile by MAdvertise et Bemobee
Mobile note mobile by MAdvertise et Bemobee
 
Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...
Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...
Comparison of Jacob's Creek, Rosemount Estate, McGuigan Wines, Lindeman's and...
 
(151105)영화속 와인 이야기 v2
(151105)영화속 와인 이야기   v2(151105)영화속 와인 이야기   v2
(151105)영화속 와인 이야기 v2
 
Symfony day 2016
Symfony day 2016Symfony day 2016
Symfony day 2016
 
Conductores electricos
Conductores electricosConductores electricos
Conductores electricos
 
Short Intro to Android Fragments
Short Intro to Android FragmentsShort Intro to Android Fragments
Short Intro to Android Fragments
 

Similar to 2013 eswc-bio2rdf-r2

2016 bmdid-mappings
2016 bmdid-mappings2016 bmdid-mappings
2016 bmdid-mappings
Michel Dumontier
 
Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013
François Belleau
 
Querying Bio2RDF data
Querying Bio2RDF dataQuerying Bio2RDF data
Querying Bio2RDF data
alison.callahan
 
SADI CSHALS 2013
SADI CSHALS 2013SADI CSHALS 2013
SADI CSHALS 2013
Mark Wilkinson
 
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...Michel Dumontier
 
Bio2RDF@BH2010
Bio2RDF@BH2010Bio2RDF@BH2010
Bio2RDF@BH2010
François Belleau
 
Presentation forpd bj_1
Presentation forpd bj_1Presentation forpd bj_1
Presentation forpd bj_1Maori Ito
 
Chem2bio2rdf portal
Chem2bio2rdf portalChem2bio2rdf portal
Chem2bio2rdf portal
Bin Chen
 
How can you access PubChem programmatically?
How can you access PubChem programmatically?How can you access PubChem programmatically?
How can you access PubChem programmatically?
Sunghwan Kim
 
Web services for sharing germplasm data sets, at FAO in Rome (2006)
Web services for sharing germplasm data sets, at FAO in Rome (2006)Web services for sharing germplasm data sets, at FAO in Rome (2006)
Web services for sharing germplasm data sets, at FAO in Rome (2006)
Dag Endresen
 
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
Dag Endresen
 
Use of open_linked_data_in_bioinformatics
Use of open_linked_data_in_bioinformaticsUse of open_linked_data_in_bioinformatics
Use of open_linked_data_in_bioinformaticsRemzi Çelebi
 
Gene Ontology Network Enrichment Analysis
Gene Ontology Network Enrichment AnalysisGene Ontology Network Enrichment Analysis
Gene Ontology Network Enrichment Analysis
UC Davis
 
Overview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data AnalysisOverview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data Analysis
Bioinformatics and Computational Biosciences Branch
 
Towards semantic systems chemical biology
Towards semantic systems chemical biology Towards semantic systems chemical biology
Towards semantic systems chemical biology
Bin Chen
 
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical KnowledgeBioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
Chunlei Wu
 
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
Dag Endresen
 
Presentation at the EMBL-EBI Industry RDF meeting
Presentation at the EMBL-EBI  Industry RDF meetingPresentation at the EMBL-EBI  Industry RDF meeting
Presentation at the EMBL-EBI Industry RDF meeting
Johannes Keizer
 
AGROVOC, AGRIS and the CIARD RING, using RDF vocabularies and technologies f...
AGROVOC, AGRIS and the CIARD RING,  using RDF vocabularies and technologies f...AGROVOC, AGRIS and the CIARD RING,  using RDF vocabularies and technologies f...
AGROVOC, AGRIS and the CIARD RING, using RDF vocabularies and technologies f...
AIMS (Agricultural Information Management Standards)
 

Similar to 2013 eswc-bio2rdf-r2 (20)

Bio2RDF @ W3C HCLS2009
Bio2RDF @ W3C HCLS2009Bio2RDF @ W3C HCLS2009
Bio2RDF @ W3C HCLS2009
 
2016 bmdid-mappings
2016 bmdid-mappings2016 bmdid-mappings
2016 bmdid-mappings
 
Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013Producing, publishing and consuming linked data - CSHALS 2013
Producing, publishing and consuming linked data - CSHALS 2013
 
Querying Bio2RDF data
Querying Bio2RDF dataQuerying Bio2RDF data
Querying Bio2RDF data
 
SADI CSHALS 2013
SADI CSHALS 2013SADI CSHALS 2013
SADI CSHALS 2013
 
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
Bio2RDF Release 2: Improved coverage, interoperability and provenance of Link...
 
Bio2RDF@BH2010
Bio2RDF@BH2010Bio2RDF@BH2010
Bio2RDF@BH2010
 
Presentation forpd bj_1
Presentation forpd bj_1Presentation forpd bj_1
Presentation forpd bj_1
 
Chem2bio2rdf portal
Chem2bio2rdf portalChem2bio2rdf portal
Chem2bio2rdf portal
 
How can you access PubChem programmatically?
How can you access PubChem programmatically?How can you access PubChem programmatically?
How can you access PubChem programmatically?
 
Web services for sharing germplasm data sets, at FAO in Rome (2006)
Web services for sharing germplasm data sets, at FAO in Rome (2006)Web services for sharing germplasm data sets, at FAO in Rome (2006)
Web services for sharing germplasm data sets, at FAO in Rome (2006)
 
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
GBIF web services for biodiversity data, for USDA GRIN, Washington DC, USA (2...
 
Use of open_linked_data_in_bioinformatics
Use of open_linked_data_in_bioinformaticsUse of open_linked_data_in_bioinformatics
Use of open_linked_data_in_bioinformatics
 
Gene Ontology Network Enrichment Analysis
Gene Ontology Network Enrichment AnalysisGene Ontology Network Enrichment Analysis
Gene Ontology Network Enrichment Analysis
 
Overview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data AnalysisOverview of Next Gen Sequencing Data Analysis
Overview of Next Gen Sequencing Data Analysis
 
Towards semantic systems chemical biology
Towards semantic systems chemical biology Towards semantic systems chemical biology
Towards semantic systems chemical biology
 
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical KnowledgeBioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
BioThings API: Building a FAIR API Ecosystem for Biomedical Knowledge
 
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
TDWG and GBIF, at European genbank network meeting (Bonn, April 2004)
 
Presentation at the EMBL-EBI Industry RDF meeting
Presentation at the EMBL-EBI  Industry RDF meetingPresentation at the EMBL-EBI  Industry RDF meeting
Presentation at the EMBL-EBI Industry RDF meeting
 
AGROVOC, AGRIS and the CIARD RING, using RDF vocabularies and technologies f...
AGROVOC, AGRIS and the CIARD RING,  using RDF vocabularies and technologies f...AGROVOC, AGRIS and the CIARD RING,  using RDF vocabularies and technologies f...
AGROVOC, AGRIS and the CIARD RING, using RDF vocabularies and technologies f...
 

More from Michel Dumontier

FAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable PredictionsFAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable Predictions
Michel Dumontier
 
A metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsA metadata standard for Knowledge Graphs
A metadata standard for Knowledge Graphs
Michel Dumontier
 
Data-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge GraphsData-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge Graphs
Michel Dumontier
 
Evaluating FAIRness
Evaluating FAIRnessEvaluating FAIRness
Evaluating FAIRness
Michel Dumontier
 
The Role of the FAIR Guiding Principles for an effective Learning Health System
The Role of the FAIR Guiding Principles for an effective Learning Health SystemThe Role of the FAIR Guiding Principles for an effective Learning Health System
The Role of the FAIR Guiding Principles for an effective Learning Health System
Michel Dumontier
 
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
Michel Dumontier
 
The role of the FAIR Guiding Principles in a Learning Health System
The role of the FAIR Guiding Principles in a Learning Health SystemThe role of the FAIR Guiding Principles in a Learning Health System
The role of the FAIR Guiding Principles in a Learning Health System
Michel Dumontier
 
Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...
Michel Dumontier
 
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Michel Dumontier
 
Are we FAIR yet? And will it be worth it?
Are we FAIR yet? And will it be worth it?Are we FAIR yet? And will it be worth it?
Are we FAIR yet? And will it be worth it?
Michel Dumontier
 
The Future of FAIR Data: An international social, legal and technological inf...
The Future of FAIR Data: An international social, legal and technological inf...The Future of FAIR Data: An international social, legal and technological inf...
The Future of FAIR Data: An international social, legal and technological inf...
Michel Dumontier
 
Keynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University DinnerKeynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University Dinner
Michel Dumontier
 
The future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureThe future of science and business - a UM Star Lecture
The future of science and business - a UM Star Lecture
Michel Dumontier
 
Are we FAIR yet?
Are we FAIR yet?Are we FAIR yet?
Are we FAIR yet?
Michel Dumontier
 
Developing and assessing FAIR digital resources
Developing and assessing FAIR digital resourcesDeveloping and assessing FAIR digital resources
Developing and assessing FAIR digital resources
Michel Dumontier
 
Advancing Biomedical Knowledge Reuse with FAIR
Advancing Biomedical Knowledge Reuse with FAIRAdvancing Biomedical Knowledge Reuse with FAIR
Advancing Biomedical Knowledge Reuse with FAIR
Michel Dumontier
 
A Framework to develop the FAIR Metrics
A Framework to develop the FAIR MetricsA Framework to develop the FAIR Metrics
A Framework to develop the FAIR Metrics
Michel Dumontier
 
FAIR principles and metrics for evaluation
FAIR principles and metrics for evaluationFAIR principles and metrics for evaluation
FAIR principles and metrics for evaluation
Michel Dumontier
 
Towards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessTowards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRness
Michel Dumontier
 
Data Science for the Win
Data Science for the WinData Science for the Win
Data Science for the Win
Michel Dumontier
 

More from Michel Dumontier (20)

FAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable PredictionsFAIR & AI Ready KGs for Explainable Predictions
FAIR & AI Ready KGs for Explainable Predictions
 
A metadata standard for Knowledge Graphs
A metadata standard for Knowledge GraphsA metadata standard for Knowledge Graphs
A metadata standard for Knowledge Graphs
 
Data-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge GraphsData-Driven Discovery Science with FAIR Knowledge Graphs
Data-Driven Discovery Science with FAIR Knowledge Graphs
 
Evaluating FAIRness
Evaluating FAIRnessEvaluating FAIRness
Evaluating FAIRness
 
The Role of the FAIR Guiding Principles for an effective Learning Health System
The Role of the FAIR Guiding Principles for an effective Learning Health SystemThe Role of the FAIR Guiding Principles for an effective Learning Health System
The Role of the FAIR Guiding Principles for an effective Learning Health System
 
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat...
 
The role of the FAIR Guiding Principles in a Learning Health System
The role of the FAIR Guiding Principles in a Learning Health SystemThe role of the FAIR Guiding Principles in a Learning Health System
The role of the FAIR Guiding Principles in a Learning Health System
 
Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...Acclerating biomedical discovery with an internet of FAIR data and services -...
Acclerating biomedical discovery with an internet of FAIR data and services -...
 
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ...
 
Are we FAIR yet? And will it be worth it?
Are we FAIR yet? And will it be worth it?Are we FAIR yet? And will it be worth it?
Are we FAIR yet? And will it be worth it?
 
The Future of FAIR Data: An international social, legal and technological inf...
The Future of FAIR Data: An international social, legal and technological inf...The Future of FAIR Data: An international social, legal and technological inf...
The Future of FAIR Data: An international social, legal and technological inf...
 
Keynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University DinnerKeynote at the 2018 Maastricht University Dinner
Keynote at the 2018 Maastricht University Dinner
 
The future of science and business - a UM Star Lecture
The future of science and business - a UM Star LectureThe future of science and business - a UM Star Lecture
The future of science and business - a UM Star Lecture
 
Are we FAIR yet?
Are we FAIR yet?Are we FAIR yet?
Are we FAIR yet?
 
Developing and assessing FAIR digital resources
Developing and assessing FAIR digital resourcesDeveloping and assessing FAIR digital resources
Developing and assessing FAIR digital resources
 
Advancing Biomedical Knowledge Reuse with FAIR
Advancing Biomedical Knowledge Reuse with FAIRAdvancing Biomedical Knowledge Reuse with FAIR
Advancing Biomedical Knowledge Reuse with FAIR
 
A Framework to develop the FAIR Metrics
A Framework to develop the FAIR MetricsA Framework to develop the FAIR Metrics
A Framework to develop the FAIR Metrics
 
FAIR principles and metrics for evaluation
FAIR principles and metrics for evaluationFAIR principles and metrics for evaluation
FAIR principles and metrics for evaluation
 
Towards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRnessTowards metrics to assess and encourage FAIRness
Towards metrics to assess and encourage FAIRness
 
Data Science for the Win
Data Science for the WinData Science for the Win
Data Science for the Win
 

Recently uploaded

The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
Delapenabediema
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
EverAndrsGuerraGuerr
 
678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf
CarlosHernanMontoyab2
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Atul Kumar Singh
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
beazzy04
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
MIRIAMSALINAS13
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
vaibhavrinwa19
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
DhatriParmar
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
Levi Shapiro
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
Tamralipta Mahavidyalaya
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
timhan337
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
GeoBlogs
 

Recently uploaded (20)

The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
 
678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf678020731-Sumas-y-Restas-Para-Colorear.pdf
678020731-Sumas-y-Restas-Para-Colorear.pdf
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
 
Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
 
Honest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptxHonest Reviews of Tim Han LMA Course Program.pptx
Honest Reviews of Tim Han LMA Course Program.pptx
 
The geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideasThe geography of Taylor Swift - some ideas
The geography of Taylor Swift - some ideas
 

2013 eswc-bio2rdf-r2

  • 1. Bio2RDF Release 2: Improved coverage, interoperability and provenance of Life Science Linked Data Linked Data for the Life Sciences Alison Callahan1, Jose Cruz-Toledo1, Peter Ansell2, Michel Dumontier1 1Carleton University, 2University of Queensland ESWC2013::Bio2RDF Release 21
  • 2. ESWC2013::Bio2RDF Release 2 is an open source framework on the emerging semantic web to produce and provide biological linked data that uses simple conventions 2
  • 3. ESWC2013::Bio2RDF Release 2 reduces the time and effort so that you can get to involved in data integration doing science 3
  • 4. Main features of Bio2RDF Release 2 • Bio2RDF conversion scripts, mapping files and web application are open source and freely available at http://github.com/bio2rdf • Bio2RDF enables (syntactic) data integration within and across datasets by using one language (RDF), having a common URI pattern and a common resource registry • 19 Release 2 datasets have provenance and endpoints feature pre-computed graph summaries for fast lookup • Bio2RDF web application enables entity resolution, query federation across an expandable distributed network of SPARQL endpoints ESWC2013::Bio2RDF Release 24
  • 5. ESWC2013::Bio2RDF Release 2 At the heart of Linked Data for the Life Sciences 5
  • 6. Bio2RDF data are identified using simple http URI patterns Bio2RDF data are identified by Internationalized Resource Identifiers (IRIs) of the form: • http://bio2rdf.org/namespace:identifier for source data with an assigned identifier • http://bio2rdf.org/namespace_resource:identifier for source data without an assigned identifier • http://bio2rdf.org/namespace_vocabulary:identifier for dataset-specific types and relations Where namespace comes from a curated registry of datasets, hence enabling simple syntactic-based integration in and across datasets. ESWC2013::Bio2RDF Release 26
  • 7. Data are described through machine- understandable statements ESWC2013::Bio2RDF Release 2 drugbank:DB00650 drugbank_vocabulary:Drug rdf:type drugbank_resource:DB00440_DB00650 drugbank_vocabulary:Drug-Drug-Interaction rdf:type drugbank_vocabulary:ddi-interactor-in rdfs:label DDI between Trimethoprim and Leucovorin [drugbank_resource: DB00440_DB00650] rdfs:label Leucovorin [drugbank:DB00650] 7
  • 8. The linked data network expands with inter-dataset statements ESWC2013::Bio2RDF Release 2 drugbank:DB00650 pharmgkb_vocabulary:Drug rdf:type rdfs:label Leucovorin [drugbank:DB00650] pharmgkb:PA450198 drugbank_vocabulary:Drug pharmgkb_vocabulary:xref leucovorin [pharmgkb:PA450198] rdfs:label DrugBank PharmGKB 8
  • 9. Linked Data: You can look it up ESWC2013::Bio2RDF Release 29
  • 10. You can get what links to it ESWC2013::Bio2RDF Release 2 http://bio2rdf.org/linksns/drugbank/drugbank:DB00650 What links to DrugBank’s Leucovorin? 10
  • 11. Every Bio2RDF dataset now contains provenance metadata ESWC2013::Bio2RDF Release 2 Features - Entity-dataset link - Creator - Publisher - Date created - License & rights - Source - Availability - SPARQL endpoint - Data dump Vocabularies VoID Dublin Core W3C Provenance Bio2RDF vocabulary 11 data item Bio2RDF dataset Source dataset void:inDataset prov:wasDerivedFrom
  • 12. Dataset Namespace # of triples Affymetrix affymetrix 44469611 Biomodels* biomodels 589753 Comparative Toxicogenomics Database ctd 141845167 DrugBank drugbank 1121468 NCBI Gene ncbigene 394026267 Gene Ontology Annotations goa 80028873 HUGO Gene Nomenclature Committee hgnc 836060 Homologene homologene 1281881 InterPro* interpro 999031 iProClass iproclass 211365460 iRefIndex irefindex 31042135 Medical Subject Headings mesh 4172230 NCBO BioPortal* bioportal 15384622 National Drug Code Directory* ndc 17814216 Online Mendelian Inheritance in Man omim 1848729 Pharmacogenomics Knowledge Base pharmgkb 37949275 SABIO-RK* sabiork 2618288 Saccharomyces Genome Database sgd 5551009 NCBI Taxonomy taxon 17814216 Total 19 1,010,758,291 Bio2RDF Release 2 – New and Updated Datasets ESWC2013::Bio2RDF Release 212
  • 15. The Wider Network of Bio2RDF Linked Data ESWC2013::Bio2RDF Release 215
  • 19. Graph summaries in query formulation ESWC2013::Bio2RDF Release 2 PREFIX drugbank_vocabulary: <http://bio2rdf.org/drugbank_vocabulary:> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> SELECT ?ddi ?d1name ?d2name WHERE { ?ddi a drugbank_vocabulary:Drug-Drug-Interaction . ?d1 drugbank_vocabulary:ddi-interactor-in ?ddi . ?d1 rdfs:label ?d1name . ?d2 drugbank_vocabulary:ddi-interactor-in ?ddi . ?d2 rdfs:label ?d2name. FILTER (?d1 != ?d2) } 19
  • 20. You can use the SPARQLed query assistant with updated endpoints ESWC2013::Bio2RDF Release 2 http://sindicetech.com/sindice-suite/sparqled/ graph: http://sindicetech.com/analytics20
  • 21. Use virtuoso’s built in faceted browser to construct increasingly complex queries with little effort ESWC2013::Bio2RDF Release 221
  • 22. Federated Queries over independent SPARQL endpoints # get all biochemical reactions in biomodels that are kinds of "protein catabolic process“, as defined by the gene ontology (in bioportal endpoint) SPARQL Endpoint: http://bioportal.bio2rdf.org/sparql SELECT ?go ?label count(distinct ?x) WHERE { ?go rdfs:label ?label . ?go rdfs:subClassOf ?tgo OPTION (TRANSITIVE) . ?tgo rdfs:label ?tlabel . FILTER regex(?tlabel, "^protein catabolic process") service <http://biomodels.bio2rdf.org/sparql> { ?x <http://bio2rdf.org/biopax_vocabulary:identical-to> ?go . ?x a <http://www.biopax.org/release/biopax-level3.owl#BiochemicalReaction> . } # end service } ESWC2013::Bio2RDF Release 222
  • 23. Question: Find all proteins that interact with beta amyloid SELECT * WHERE { ?protein a bio2rdf:Protein . ?protein bio2rdf:interacts_with bio2rdf:beta-amyloid . } Heterogeneous biological data on the semantic web is difficult to query UniProt Protein PDB Protein iRefIndex Protein ? Physical interaction? Pathway interaction? Genetic interaction? ESWC2013::Bio2RDF Release 223
  • 24. ontology as a strategy to formally represent and integrate knowledge ESWC2013::Bio2RDF Release 224
  • 25. uniprot:P05067 uniprot:Protein is a sio:gene is a is a Semantic data integration, consistency checking and query answering over Bio2RDF with the Semanticscience Integrated Ontology (SIO) dataset ontology Knowledge Base ESWC2013::Bio2RDF Release 2 pharmgkb:PA30917 refseq:Protein is a is a omim:189931 omim:Gene pharmgkb:Gene Querying Bio2RDF Linked Open Data with a Global Schema. Alison Callahan, José Cruz-Toledo and Michel Dumontier. Bio-ontologies 2012. 25
  • 26. ESWC2013::Bio2RDF Release 226 SRIQ(D) 10700+ axioms 1300+ classes 201 object properties (inc. inverses) 1 datatype property
  • 27. Bio2RDF and SIO powered SPARQL 1.1 federated query: Find chemicals in CTD and proteins in SGD that participate in the same GO process SELECT ?chem, ?prot, ?proc FROM <http://bio2rdf.org/ctd> WHERE { ?chemical a sio:chemical-entity. ?chemical rdfs:label ?chem. ?chemical sio:is-participant-in ?process. ?process rdfs:label ?proc. FILTER regex (?process, "http://bio2rdf.org/go:") SERVICE <http://sgd.bio2rdf.org/sparql> { ?protein a sio:protein . ?protein sio:is-participant-in ?process. ?protein rdfs:label ?prot . } } ESWC2013::Bio2RDF Release 227
  • 28. Bio2RDF RDFization guidelines are available at our wiki https://github.com/bio2rdf/bio2rdf-scripts/wiki/RDFization-Guide-v1.1 ESWC2013::Bio2RDF Release 228
  • 29. What the future holds • Aiming for twice yearly release schedule. Next release will include large datasets (>15B RefSeq, Genbank, PubMed, PDB) • Working with identifiers.org to create a common registry with 2200 entries • Spiking in identifiers.org and original data uris, where available • Consolidating provenance/metrics with OpenPHACTS • Incorporate W3C Linking Open Drug Data (LODD) effort – OMIM (released), SIDER (beta), CHEMBL (beta), LinkedCT (beta), DailyMed (RDF available), TCM (RDF available), • Extended dataset coverage by tapping into existing endpoints (uniprot, bioportal, ebi-rdf?) • Showcase with other third party tools ESWC2013::Bio2RDF Release 229
  • 30. Bio2RDF Release 2 – A summary • Updated data conversion source code to use PHP API (all available through GitHub) – http://github.com/bio2rdf/bio2rdf-scripts – https://github.com/bio2rdf/bio2rdf-scripts/wiki/RDFization- Guide-v1.1 • Simple Bio2RDF IRI design patterns that facilitate syntactic consistency and interoperability backed by simple registry • Dataset provenance and metrics • We welcome comments, suggestions and contributions • Join our mailing list at bio2rdf@googlegroups.com ESWC2013::Bio2RDF Release 230
  • 31. Acknowledgements Bio2RDF Release 2 Allison Callahan, Jose Cruz-Toledo, Peter Ansell Bio2RDF Francois Belleau, Marc-Alexandre Nolin Alex De Leon, Steve Etlinger, Nichealla Keath Jacques Corbeil, James Hogan Jean Morissette, Nicole Tourigny, Philippe Rigault and Paul Roe ESWC2013::Bio2RDF Release 231
  • 32. dumontierlab.com michel_dumontier@carleton.ca ESWC2013::Bio2RDF Release 2 Website: http://dumontierlab.com Presentations: http://slideshare.com/micheldumontier 32

Editor's Notes

  1. Bio2RDF facilitates data sharing and reuse in a number of ways: Access the scripts that generate our linked data. These scripts are openly licensed for modification, redistribution or use by anyone wishing to generate RDF data on their own We make use of syntactically consistent IRI patterns We do not impose a structural scheme for representing our data Bio2RDF allows for multiple mirrors host our data, a DNS “round robin” procedure balances incoming requests between participating mirrors, this prevents a single point of failure and allows for additional mirrors to be added to the network A centralized resource registry has been developed for consistent usage of namespaces and vocabularies- In the next slides I will go into a bit more detail about the first 3 points
  2. The Bio2RDF project transforms silos of life science data into a globally distributed network of linked data for biological knowledge discovery.
  3. In order to keep a clear link back to the original data, in our RDFized datasets we maintain the original data provider’s record identifiers by making use of the following URI pattern:- namespace: preferred short name for a biological dataset
  4. All resources in every bio2RDF graph is linked to a graph that describes the provenance of the generated linked data. We make use of the W3C Vocabulary of Interlinked Datasets (VoID), the Provenance vocabulary and Dublin core vocabulary to describe items such as: URL to the source data Date of generation Licensing information (if available) URL to the script used to generate the data Download URL for linked data files URL of SPARQL endpoint
  5. Here I present some numbers for the 19 updated datasets
  6. Here I am showing with more detail the top 10 subject type – predicate - object type frequencies that can be found in our Drugbank endpoint So for example to find drugs that participate in drug-drug interactions one would use the ddi-interactor-in predicate that associates the 1074 resources of type drug to the 10891 resources typed as drug-drug-interaction.
  7. We are currently in the process of exposing Bio2RDF data using a variety of 3rd party tools- specifically, all of our endpoints can be navigated using Virtuoso’s faceted browser- We have also generated Sindice metrics that enable the use of SPARQLedautomed query builder. (This tool allows users to semiautomatically construct SPARQL queries based on namespaces used therein)-- finally it is possible to use Sig.ma browser to aggreate data from multiple endpoints and construct mashups
  8. However, types and predicates are dataset specific and therefore not consistentWouldn’t it be nice to be able to query all of these datasets with a common nomenclature? (our objective)