Semantic Approaches for Biomedical Knowledge Discovery
Michel Dumontier, Ph.D.
Associate Professor of Medicine (Biomedical Informatics)
Stanford University
@micheldumontier::DS:10-10-2014
http://xkcd.com/242/
Science
The unbelievable growth of scientific knowledge
Thousands of databases curate the literature into consumable facts
(problems: access, format, identifiers & linking)
Software is needed to analyze, predict and evaluate
(problems: OS, versioning, input/output formats)
Ultimately, we develop fairly sophisticated
programs/workflows to test our hypotheses
Wouldn’t it be great if we could just find the evidence
required to support or dispute a scientific hypothesis using
the most up-to-date and relevant data, tools and scientific
knowledge?
So what do we need to achieve this?
1. Standards to construct and interrogate a
massive, decentralized network of
interconnected data and software
2. Methods and Tools
– To prepare, interlink, and query data
– To mine and discover associations
– To identify novel, supported associations
3. Incentives and penalties
– Funding agencies, journals, institutions, societies,
conferences, workshops
The Semantic Web
is the new global web of knowledge
standards for publishing, sharing and querying
facts, expert knowledge and services
scalable approach for the discovery
of independently formulated
and distributed knowledge
we’re building a massive network of linked data
Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/
Linked Data for the Life Sciences
Bio2RDF is an open source project to unify the
representation and interlinking of biological data using RDF.
chemicals/drugs/formulations,
genomes/genes/proteins, domains
Interactions, complexes & pathways
animal models and phenotypes
Disease, genetic markers, treatments
Terminologies & publications
• Release 3 (June 2014): 11B+ interlinked
statements from 35 biomedical datasets
• dataset description, provenance & statistics
• Partnerships with EBI, NCBI, DBCLS, NCBO,
OpenPHACTS, and commercial tool providers
Resource Description Framework
• It’s a language to represent knowledge
– Logic-based formalism -> automated reasoning
– graph-like properties -> data analysis
• Good for
– Describing in terms of type, attributes, relations
– Integrating data from different sources
– Sharing the data (W3C standard)
– Reusing what is available, developing what you need,
and contributing back to the web of data.
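To make the triple model concrete, here is a minimal sketch in plain Python. The identifiers are the diclofenac example from this deck; the helper names (`match`, `target_labels`) are hypothetical and not part of any RDF library.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Statements taken from the diclofenac example in the slides.
TRIPLES: List[Triple] = [
    ("drugbank:DB00586", "rdf:type", "drugbank_vocabulary:Drug"),
    ("drugbank:DB00586", "rdfs:label", "Diclofenac [drugbank:DB00586]"),
    ("drugbank:DB00586", "drugbank_vocabulary:targets", "drugbank_target:290"),
    ("drugbank_target:290", "rdf:type", "drugbank_vocabulary:Target"),
    ("drugbank_target:290", "rdfs:label",
     "Prostaglandin G/H synthase 2 [drugbank_target:290]"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def target_labels(triples: List[Triple], drug: str) -> List[str]:
    """Follow drug -> targets -> rdfs:label, i.e. a two-step graph walk."""
    labels: List[str] = []
    for _, _, target in match(triples, s=drug, p="drugbank_vocabulary:targets"):
        labels.extend(o for _, _, o in match(triples, s=target, p="rdfs:label"))
    return labels
```

Describing by type, attributes and relations then reduces to pattern matching over the same three-column structure, whichever dataset the statements come from.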
drugbank:DB00586 rdf:type drugbank_vocabulary:Drug .
drugbank:DB00586 rdfs:label "Diclofenac [drugbank:DB00586]" .
drugbank:DB00586 drugbank_vocabulary:targets drugbank_target:290 .
drugbank_target:290 rdf:type drugbank_vocabulary:Target .
drugbank_target:290 rdfs:label "Prostaglandin G/H synthase 2 [drugbank_target:290]" .
The linked data network expands
with every reference
DrugBank
drugbank:DB00586 rdf:type drugbank_vocabulary:Drug .
drugbank:DB00586 rdfs:label "diclofenac [drugbank:DB00586]" .
PharmGKB
pharmgkb:PA449293 rdf:type pharmgkb_vocabulary:Drug .
pharmgkb:PA449293 rdfs:label "diclofenac [pharmgkb:PA449293]" .
pharmgkb:PA449293 pharmgkb_vocabulary:xref drugbank:DB00586 .
Bio2RDF offers a highly connected
network of data
Graph summarization for query formulation
PREFIX drugbank_vocabulary: <http://bio2rdf.org/drugbank_vocabulary:>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?ddi ?d1name ?d2name
WHERE {
  ?ddi a drugbank_vocabulary:Drug-Drug-Interaction .
  ?d1 drugbank_vocabulary:ddi-interactor-in ?ddi .
  ?d1 rdfs:label ?d1name .
  ?d2 drugbank_vocabulary:ddi-interactor-in ?ddi .
  ?d2 rdfs:label ?d2name .
  FILTER (?d1 != ?d2)
}
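A query like this is sent to an endpoint over HTTP. The sketch below only builds the GET request URL using the standard `query` parameter of the SPARQL protocol; it sends nothing. The endpoint address is an assumption that follows the `<dataset>.bio2rdf.org/sparql` pattern used elsewhere in this deck, and `build_request_url` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: DrugBank endpoint named like the other Bio2RDF endpoints in the deck.
ENDPOINT = "http://drugbank.bio2rdf.org/sparql"

DDI_QUERY = """\
PREFIX drugbank_vocabulary: <http://bio2rdf.org/drugbank_vocabulary:>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?ddi ?d1name ?d2name
WHERE {
  ?ddi a drugbank_vocabulary:Drug-Drug-Interaction .
  ?d1 drugbank_vocabulary:ddi-interactor-in ?ddi .
  ?d1 rdfs:label ?d1name .
  ?d2 drugbank_vocabulary:ddi-interactor-in ?ddi .
  ?d2 rdfs:label ?d2name .
  FILTER (?d1 != ?d2)
}
"""


def build_request_url(endpoint: str, query: str) -> str:
    """Percent-encode the query text into the 'query' parameter of a GET URL."""
    return f"{endpoint}?{urlencode({'query': query})}"


REQUEST_URL = build_request_url(ENDPOINT, DDI_QUERY)

# Decoding the URL gives back the exact query text.
DECODED_QUERY = parse_qs(urlparse(REQUEST_URL).query)["query"][0]
```

Because the filter is `?d1 != ?d2`, every interacting pair comes back twice, once in each order. Comparing `STR(?d1) < STR(?d2)` instead would return each pair once.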
You can use query assistants
http://sindicetech.com/sindice-suite/sparqled/
graph: http://sindicetech.com/analytics
Federated Queries over Independent
SPARQL EndPoints
Get all protein catabolic processes (and more specific) in biomodels
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Count the distinct reactions annotated with each matching GO term
SELECT ?go ?label (COUNT(DISTINCT ?x) AS ?reactions)
WHERE {
  SERVICE <http://bioportal.bio2rdf.org/sparql> {
    ?go rdfs:label ?label .
    ?go rdfs:subClassOf ?tgo .
    ?tgo rdfs:label ?tlabel .
    FILTER regex(?tlabel, "^protein catabolic process")
  }
  SERVICE <http://biomodels.bio2rdf.org/sparql> {
    ?x <http://bio2rdf.org/biopax_vocabulary:identical-to> ?go .
    ?x a <http://www.biopax.org/release/biopax-level3.owl#BiochemicalReaction> .
  }
}
GROUP BY ?go ?label
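What the federated query does can be sketched as a join in Python: one result set comes from each endpoint, the two are joined on `?go`, and distinct reactions are counted per term. The function name and all row values are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def count_reactions_per_term(
    go_rows: Iterable[Tuple[str, str]],        # (?go, ?label) from the ontology endpoint
    reaction_rows: Iterable[Tuple[str, str]],  # (?x, ?go) from the models endpoint
) -> Dict[str, Tuple[str, int]]:
    """Join both result sets on ?go and count distinct ?x per GO term."""
    labels = {go: label for go, label in go_rows}
    reactions: Dict[str, Set[str]] = defaultdict(set)
    for x, go in reaction_rows:
        if go in labels:          # the join condition on ?go
            reactions[go].add(x)  # a set gives COUNT(DISTINCT ?x)
    return {go: (labels[go], len(xs)) for go, xs in reactions.items()}


# Hypothetical rows standing in for the two endpoint responses.
GO_ROWS = [("go:A", "process A"), ("go:B", "process B")]
REACTION_ROWS = [
    ("model1:r1", "go:A"),
    ("model1:r1", "go:A"),  # duplicate row, counted once
    ("model2:r7", "go:A"),
    ("model2:r9", "go:C"),  # no matching GO term, dropped by the join
]

COUNTS = count_reactions_per_term(GO_ROWS, REACTION_ROWS)
```

Terms with no annotated reaction (here `go:B`) do not appear in the result, which matches the inner-join behaviour of two plain `SERVICE` blocks.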
Bio2RDF: 2M+ SPARQL queries per month
Despite all the data, it’s still hard to find answers to questions
Because there are many ways to represent the same data
and each dataset represents it differently
multiple formalizations of the same kind of
data do emerge, each with their own merit
Massive Proliferation of Ontologies / Vocabularies
Multi-Stakeholder Efforts to Standardize
Representations are Reasonable,
Long Term Strategies for Data Integration
Semantic data integration, consistency checking and
query answering over Bio2RDF with the
Semanticscience Integrated Ontology (SIO)
[Figure labels: dataset, ontology, Knowledge Base]
[Instances: uniprot:P05067, pharmgkb:PA30917, omim:189931]
[Dataset classes: uniprot:Protein, refseq:Protein, omim:Gene, pharmgkb:Gene]
[Ontology class: sio:gene; edges labelled "is a"]
Querying Bio2RDF Linked Open Data with a Global Schema. Alison Callahan, José Cruz-Toledo and
Michel Dumontier. Bio-ontologies 2012.
SRIQ(D)
10700+ axioms
1300+ classes
201 object properties (inc. inverses)
1 datatype property
Bio2RDF and SIO powered SPARQL 1.1 federated query:
Find chemicals (from CTD) and proteins (from SGD) that
participate in the same process (from GOA)
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# sio: namespace assumed to be the SIO resource namespace
PREFIX sio: <http://semanticscience.org/resource/>
SELECT ?chem ?prot ?proc
FROM <http://bio2rdf.org/ctd>
WHERE {
  SERVICE <http://ctd.bio2rdf.org/sparql> {
    ?chemical a sio:chemical-entity .
    ?chemical rdfs:label ?chem .
    ?chemical sio:is-participant-in ?process .
    ?process rdfs:label ?proc .
    # regex needs a string, so cast the process IRI with STR()
    FILTER regex(STR(?process), "http://bio2rdf.org/go:")
  }
  SERVICE <http://sgd.bio2rdf.org/sparql> {
    ?protein a sio:protein .
    ?protein sio:is-participant-in ?process .
    ?protein rdfs:label ?prot .
  }
}
tactical formalization
Take what you need
and represent it in a way that directly serves your objective
STANDARDS
USER DRIVEN REPRESENTATION
identifying aberrant and pharmacological pathways
predicting drug targets using organism phenotypes
Biopax-pathway exploration
FALDO-powered genome navigation
aberrant and pharmacological pathways
Q1. Can we identify pathways that are associated
with a particular disease or class of diseases?
Q2. Can we identify pathways that are associated with
a particular drug or class of drugs?
[Diagram labels: drug, pathway, disease, gene]
Identification of
drug and disease enriched pathways
• Approach
– Integrate 3 datasets
• DrugBank, PharmGKB and CTD
– Integrate 7 terminologies
• MeSH, ATC, ChEBI, UMLS, SNOMED, ICD, DO
– Formalize data of interest
– Identify significant associations using enrichment
analysis over the fully inferred knowledge base
Identifying aberrant pathways through integrated analysis of knowledge in pharmacogenomics.
Bioinformatics. 2012.
Formal knowledge
representation
as a strategy for
data integration
Have you heard of OWL?
Example: mercaptopurine across sources
mercaptopurine [pharmgkb:PA450379]
mercaptopurine [drugbank:DB01033]
purine-6-thiol [CHEBI:2208]
mercaptopurine [ATC:L01BB02]
mercaptopurine [mesh:D015122]
Axiom patterns shown: class equivalence, class subsumption, top-level classes (disjointness), property chains, reciprocal existentials
Top-level classes: drug, disease, gene, pathway
Formalized as an OWL-EL ontology
650,000+ classes, 3.2M subClassOf axioms, 75,000 equivalentClass axioms
Benefits: Enhanced Query Capability
– Use any mapped terminology to query a target resource.
– Use knowledge in target ontologies to formulate more
precise questions
• ask for drugs that are associated with diseases of the joint:
‘Chikungunya’ (do:0050012) is defined as a viral infectious disease
located in the ‘joint’ (fma:7490) and caused by a ‘Chikungunya
virus’ (taxon:37124).
– Learn relationships that are inferred by automated
reasoning.
• alcohol (ChEBI:30879) is associated with alcoholism (PA443309)
since alcoholism is directly associated with ethanol (CHEBI:16236)
• ‘parasitic infectious disease’ (do:0001398) associated with 129
drugs, 15 more than are directly linked.
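One way such extra associations can arise is by propagating them up the class hierarchy: a drug linked to a specific disease is also linked to every superclass of that disease. The sketch below illustrates this with a toy hierarchy; the class and drug names are invented, and a real OWL reasoner does considerably more than this closure.

```python
from typing import Dict, Iterable, Set


def ancestors(cls: str, parents: Dict[str, Iterable[str]]) -> Set[str]:
    """Return cls plus all of its superclasses, following subClassOf edges."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_drugs(direct: Dict[str, Set[str]],
                   parents: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Propagate each disease's drugs to all superclasses of that disease."""
    result: Dict[str, Set[str]] = {}
    for disease, drugs in direct.items():
        for ancestor in ancestors(disease, parents):
            result.setdefault(ancestor, set()).update(drugs)
    return result


# Hypothetical subClassOf edges and direct drug-disease links.
PARENTS = {
    "disease_a": ["parasitic infectious disease"],
    "disease_b": ["parasitic infectious disease"],
    "parasitic infectious disease": ["infectious disease"],
}
DIRECT = {
    "parasitic infectious disease": {"drug_1"},
    "disease_a": {"drug_2"},
    "disease_b": {"drug_2", "drug_3"},
}

INFERRED = inferred_drugs(DIRECT, PARENTS)
```

In this toy data the parent class has one directly linked drug and three after inference, the same kind of gain as the 129 versus 114 drugs reported on the slide.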
Knowledge Discovery through Data
Integration and Enrichment Analysis
• OntoFunc: a tool to discover significant associations between sets of objects
and ontology categories. It tests for enrichment of an attribute among a selected
set of input items compared with a reference set, using the hypergeometric or
binomial distribution, Fisher's exact test, or a chi-square test.
• We found 22,653 disease-pathway associations, where for each pathway
we find genes that are linked to disease.
– Mood disorder (do:3324) associated with Zidovudine Pathway
(pharmgkb:PA165859361). Zidovudine is for treating HIV/AIDS. Side
effects include fatigue, headache, myalgia, malaise and anorexia
• We found 13,826 pathway-chemical associations
– Clopidogrel (chebi:37941) associated with Endothelin signaling
pathway (pharmgkb:PA164728163). Endothelins are proteins that
constrict blood vessels and raise blood pressure. Clopidogrel inhibits
platelet aggregation and prolongs bleeding time.
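The enrichment test behind associations like these can be sketched with the hypergeometric distribution, one of the options listed above. This is a generic illustration, not OntoFunc's code; the function name and the numbers in the example are made up.

```python
from math import comb


def hypergeom_pvalue(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k): chance of seeing at least k annotated items in a sample.

    N: size of the reference set (e.g. all genes)
    K: reference items carrying the attribute (e.g. disease-linked genes)
    n: size of the selected set (e.g. genes in one pathway)
    k: selected items carrying the attribute
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k):
        raise ValueError("inconsistent counts")
    upper = min(n, K)
    # comb(a, b) is 0 when b > a, so impossible terms drop out on their own.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Toy example: 10 genes, 5 disease-linked, a 5-gene pathway made up
# entirely of disease-linked genes.
P_ALL_FIVE = hypergeom_pvalue(k=5, n=5, K=5, N=10)
```

A small p-value means the pathway holds more disease-linked genes than chance would predict. With thousands of pathway-disease pairs tested, a multiple-testing correction would normally follow.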
Tactical Formalization + Automated Reasoning
Offers Compelling Value for Certain Problems
We need to be smart about the goal, and how best to
achieve it. Tactical formalization is another tool in the
toolbox.
We’ve formalized data as OWL ontologies to verify, fix and
exploit Linked Data through expressive OWL reasoning
• To identify mistakes in human curated knowledge
• To identify conflicting meaning in terms
• To identify mistakes in the representation of RDF data
o incorrect use of relations
o incorrect assertion of identity (owl:sameAs)
Many other applications can be envisioned.
PhenomeDrug
A computational approach to predict drug
targets, drug effects, and drug indications using
phenotypes
Mouse model phenotypes provide information about human drug targets.
Hoehndorf R, Hiebert T, Hardy NW, Schofield PN, Gkoutos GV, Dumontier M.
Bioinformatics. 2013.
Animal models provide insight into on-target effects
• For the majority of the 100 best-selling drugs ($148B in the
US alone), there is a direct correlation between
knockout phenotype and drug effect
• Immunological indications
– Antihistamines (Claritin, Allegra, Zyrtec)
– KO of histamine H1 receptor leads to decreased
responsiveness of immune system
– Predicts on-target effects: drowsiness, reduced
anxiety
Zambrowicz and Sands. Nat Rev Drug Disc. 2003.
Identifying drug targets
from mouse knock-out phenotypes
[Diagram labels: drug, effects, human gene, gene, non-functional gene model, phenotypes; edges: inhibits, ortholog, similar]
Main idea: if a drug’s phenotypes match the phenotypes of a
null model, this suggests that the drug is an inhibitor of the gene
Terminological Interoperability
(we must compare apples with apples)
Mouse phenotypes
Drug effects (mappings from UMLS to DO, NBO, MP)
Mammalian Phenotype Ontology
PhenomeNet
PhenomeDrug
[Figure: two phenotype sets compared through shared, more general classes]
Phenotype set A: poor coordination / poor rotarod performance; decreased gut peristalsis; axon degeneration; decreased stride length; stereotypic behavior; abnormal EEG; failure to find food
Phenotype set B: unstable posture; constipation; neuronal loss in substantia nigra; shuffling gait; resting tremors; REM disorder; hyposmia
General classes: abnormal coordination; abnormal digestive physiology; CNS neuron degeneration; abnormal locomotion; abnormal motor function; sleep disturbance; abnormal olfaction
Semantic Similarity
Given a drug effect profile D and a mouse model M, we
compute the semantic similarity as an information-weighted
Jaccard metric.
The similarity measure used is non-symmetrical and
determines the amount of information about a drug effect
profile D that is covered by a set of mouse model
phenotypes M.
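A minimal sketch of such a non-symmetric, information-weighted measure follows. It assumes the score is the information content (IC) shared by D and M divided by the total IC of D, which is one reading of the description above and not necessarily the exact published formula. The IC values are made up, and both term sets are assumed to already include their superclasses.

```python
from typing import Dict, Set


def coverage_similarity(drug_terms: Set[str],
                        model_terms: Set[str],
                        ic: Dict[str, float]) -> float:
    """Fraction of the information in drug profile D that model M covers.

    Assumed formula: sum of IC over D ∩ M, divided by sum of IC over D.
    Dividing by D alone (not D ∪ M) is what makes the measure non-symmetric.
    """
    total = sum(ic[t] for t in drug_terms)
    if total == 0:
        return 0.0
    shared = drug_terms & model_terms
    return sum(ic[t] for t in shared) / total


# Hypothetical information content: rarer phenotype classes carry more weight.
IC = {
    "abnormal locomotion": 1.0,
    "sleep disturbance": 3.0,
    "abnormal olfaction": 2.0,
}
D = {"abnormal locomotion", "sleep disturbance"}   # drug effect profile
M = {"sleep disturbance", "abnormal olfaction"}    # mouse model phenotypes

SIM_D_M = coverage_similarity(D, M, IC)  # how much of D is covered by M
SIM_M_D = coverage_similarity(M, D, IC)  # how much of M is covered by D
```

With these toy weights the two directions give 0.75 and 0.6, so swapping the arguments changes the score.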
Loss of function models predict
targets of inhibitor drugs
• 14,682 drugs; 7,255 mouse genotypes
• Validation against known and predicted inhibitor-target pairs
– 0.76 ROC AUC for human targets (DrugBank)
– 0.81 ROC AUC for mouse targets (STITCH)
• diclofenac (STITCH:000003032)
– NSAID used to treat pain, osteoarthritis and rheumatoid arthritis
– Drug effects include liver inflammation (hepatitis), swelling of liver
(hepatomegaly), redness of skin (erythema)
– 49% explained by PPARg knockout
• peroxisome proliferator activated receptor gamma (PPARg) regulates metabolism,
proliferation, inflammation and differentiation
• Diclofenac is a known inhibitor
– 46% explained by COX-2 knockout
• Diclofenac is a known inhibitor
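The ROC AUC figures above summarise how well similarity scores rank known drug-target pairs ahead of other pairs. A minimal sketch of that statistic, with invented scores and labels:

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive scores above a random negative.

    Ties count as half a win. Quadratic in the input size, which is fine
    for an illustration.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical similarity scores for four drug-gene pairs;
# label 1 marks a known inhibitor-target pair.
SCORES = [0.9, 0.8, 0.4, 0.3]
LABELS = [1, 0, 1, 0]

AUC = roc_auc(SCORES, LABELS)
```

Here three of the four positive-negative comparisons are ranked correctly, giving 0.75. A value of 0.5 corresponds to random ranking.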
Phenotype-Based Drug Repurposing
Using the Semantic Web to Gather
Evidence for Scientific Hypotheses
What evidence supports or disputes that TKIs are cardiotoxic?
• Tyrosine Kinase Inhibitors (TKI)
– Imatinib, Sorafenib, Sunitinib, Dasatinib, Nilotinib, Lapatinib
– Used to treat cancer
– Linked to cardiotoxicity.
• FDA launched drug safety program to detect toxicity
– Need to integrate data and ontologies (Abernethy, CPT 2011)
– Abernethy (2013) suggests using public data in genetics,
pharmacology, toxicology and systems biology to
predict/validate adverse events
• What evidence could we gather to give credence that
TKIs cause non-QT cardiotoxicity?
FDA Use Case:
TKI non-QT Cardiotoxicity
Jane P.F. Bai and Darrell R. Abernethy. Systems Pharmacology to Predict Drug Toxicity: Integration Across Levels of
Biological Organization. Annu. Rev. Pharmacol. Toxicol. 2013.53:451-473
• The goal of HyQue is to retrieve and evaluate evidence
that supports/disputes a hypothesis
– hypotheses are described as a set of events
• e.g. binding, inhibition, phenotypic effect
– events are associated with types of evidence
• a query is written to retrieve data
• a weight is assigned to provide significance
• Hypotheses are written by people who seek answers
• data retrieval rules are written by people who know the
data and how it should be interpreted
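The structure described above (hypothesis, events, evidence rules with weights) can be sketched as follows. This is not HyQue's implementation: the class names, the in-memory data and the scoring (weighted fraction per event, averaged over events) are all assumptions made for illustration. In HyQue, retrieval is done with SPARQL and evaluation with SPIN rules.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

Data = Dict[str, Set[Tuple[str, str]]]


@dataclass
class EvidenceRule:
    description: str
    retrieve: Callable[[Data], bool]  # stands in for a data retrieval query
    weight: float                     # significance assigned by the rule author


@dataclass
class Event:
    name: str
    rules: List[EvidenceRule]


def score_event(event: Event, data: Data) -> float:
    """Weighted fraction of this event's evidence rules that find support."""
    total = sum(r.weight for r in event.rules)
    if total == 0:
        return 0.0
    found = sum(r.weight for r in event.rules if r.retrieve(data))
    return found / total


def score_hypothesis(events: List[Event], data: Data) -> float:
    """Assumed aggregation: mean of the event scores."""
    if not events:
        return 0.0
    return sum(score_event(e, data) for e in events) / len(events)


# Hypothetical data and hypothesis: "drug_x inhibits gene_y and has effect_z".
DATA: Data = {
    "targets": {("drug_x", "gene_y")},
    "assays": set(),
    "adverse_events": set(),
}
EVENTS = [
    Event("inhibition", [
        EvidenceRule("curated drug target",
                     lambda d: ("drug_x", "gene_y") in d["targets"], 2.0),
        EvidenceRule("in vitro assay",
                     lambda d: ("drug_x", "gene_y") in d["assays"], 1.0),
    ]),
    Event("phenotypic effect", [
        EvidenceRule("adverse event report",
                     lambda d: ("drug_x", "effect_z") in d["adverse_events"], 1.0),
    ]),
]

HYPOTHESIS_SCORE = score_hypothesis(EVENTS, DATA)
```

Keeping rules separate from hypotheses mirrors the division of labour on the slide: people who seek answers write the events, and people who know the data write the retrieval rules and weights.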
HyQue
1. HyQue: Evaluating hypotheses using Semantic Web technologies. J Biomed Semantics. 2011 May 17;2 Suppl 2:S3.
2. Evaluating scientific hypotheses using the SPARQL Inferencing Notation. Extended Semantic Web Conference (ESWC
2012). Heraklion, Crete. May 27-31, 2012.
HyQue: A Semantic Web Application
Software
Ontologies
Data
Hypothesis Evaluation
What evidence might we gather?
• clinical: Are there cardiotoxic effects associated with the drug?
– Literature (studies) [curated db]
– Product labels (studies) [r3:sider]
– Clinical trials (studies) [r3:clinicaltrials]
– Adverse event reports [r2:pharmgkb/onesides]
– Electronic health records (observations)
• pre-clinical associations:
– genotype-phenotype (null/disease models) [r2:mgi, r2:sgd; r3:wormbase]
– in vitro assays (IC50) [r3:chembl]
– drug targets [r2:drugbank; r2:ctd; r3:stitch]
– drug-gene expression [r3:gxa]
– pathways [r2:kegg; r3:reactome]
– Drug-pathway, disease-pathway enrichments [aberrant pathways]
– Chemical properties [r2:pubchem; r2:drugbank]
– Toxicology [r1:toxkb/cebs]
Data retrieval is done with SPARQL
Data evaluation is done with SPIN rules
http://bio2rdf.org/drugbank:DB01268
In Summary
• This talk was about making sense of, and using,
the structured data we already have
• RDF-based Linked Open Data acts as a
substrate for query answering and task-based
formalization in OWL
• Discovery through the generation of testable
hypotheses in the target domain.
• Using Linked Data to evaluate scientific
hypotheses
Looking to the Future
• Community guidelines for RDF-based data and
dataset descriptions (e.g. CEDAR)
• Alignment and consolidation of OWL ontologies
(e.g. UMLS)
• Identifying and filling gaps in our knowledge (e.g.
Adam the Robot scientist)
• Improving our coverage of available evidence
(e.g. HyQue)
• More sophisticated data mining (e.g. you!)
Acknowledgements
Bio2RDF Release 2:
Alison Callahan, Jose Cruz-Toledo, Peter Ansell
Aberrant Pathways: Robert Hoehndorf, Georgios Gkoutos
PhenomeDrug: Tanya Hiebert, Robert Hoehndorf, Georgios
Gkoutos, Paul Schofield
TKI Cardiotoxicity: Alison Callahan, Tania Hiebert, Beatriz
Lujan, Sira Sarntivijai (FDA)
dumontierlab.com
michel.dumontier@stanford.edu
Website: http://dumontierlab.com
Presentations: http://slideshare.com/micheldumontier

More Related Content

What's hot

Building a Network of Interoperable and Independently Produced Linked and Ope...
Building a Network of Interoperable and Independently Produced Linked and Ope...Building a Network of Interoperable and Independently Produced Linked and Ope...
Building a Network of Interoperable and Independently Produced Linked and Ope...Michel Dumontier
 
Nucl. Acids Res.-2014-Howe-nar-gku1244
Nucl. Acids Res.-2014-Howe-nar-gku1244Nucl. Acids Res.-2014-Howe-nar-gku1244
Nucl. Acids Res.-2014-Howe-nar-gku1244Yasel Cruz
 
How many medline platforms on the web?
How many medline platforms on the web?How many medline platforms on the web?
How many medline platforms on the web?Basset Hervé
 
Exploring Chemical and Biological Knowledge Spaces with PubChem
Exploring Chemical and Biological Knowledge Spaces with PubChemExploring Chemical and Biological Knowledge Spaces with PubChem
Exploring Chemical and Biological Knowledge Spaces with PubChemPaul Thiessen
 
Data for AI models, the past, the present, the future
Data for AI models, the past, the present, the futureData for AI models, the past, the present, the future
Data for AI models, the past, the present, the futurePistoia Alliance
 
CEDAR work bench for metadata management
CEDAR work bench for metadata managementCEDAR work bench for metadata management
CEDAR work bench for metadata managementPistoia Alliance
 
Leveraging publication metadata to help overcome the data ingest bottleneck
Leveraging publication metadata to help overcome the data ingest bottleneck Leveraging publication metadata to help overcome the data ingest bottleneck
Leveraging publication metadata to help overcome the data ingest bottleneck Todd Vision
 
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...dkNET
 
Pistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance
 
Citing data in research articles: principles, implementation, challenges - an...
Citing data in research articles: principles, implementation, challenges - an...Citing data in research articles: principles, implementation, challenges - an...
Citing data in research articles: principles, implementation, challenges - an...FAIRDOM
 
Data sharing - Data management - The SysMO-SEEK Story
Data sharing - Data management - The SysMO-SEEK StoryData sharing - Data management - The SysMO-SEEK Story
Data sharing - Data management - The SysMO-SEEK StoryResearch Information Network
 
Data reuse and scholarly reward: understanding practice and building infrastr...
Data reuse and scholarly reward: understanding practice and building infrastr...Data reuse and scholarly reward: understanding practice and building infrastr...
Data reuse and scholarly reward: understanding practice and building infrastr...Todd Vision
 

What's hot (20)

Building a Network of Interoperable and Independently Produced Linked and Ope...
Building a Network of Interoperable and Independently Produced Linked and Ope...Building a Network of Interoperable and Independently Produced Linked and Ope...
Building a Network of Interoperable and Independently Produced Linked and Ope...
 
Data Science for the Win
Data Science for the WinData Science for the Win
Data Science for the Win
 
MLA CE Course: Third-Party PubMed Tools
MLA CE Course: Third-Party PubMed ToolsMLA CE Course: Third-Party PubMed Tools
MLA CE Course: Third-Party PubMed Tools
 
Third-Party PubMed Tools
Third-Party PubMed ToolsThird-Party PubMed Tools
Third-Party PubMed Tools
 
Towards a gold standard and regarding quality in public domain chemistry data...
Towards a gold standard and regarding quality in public domain chemistry data...Towards a gold standard and regarding quality in public domain chemistry data...
Towards a gold standard and regarding quality in public domain chemistry data...
 
Nucl. Acids Res.-2014-Howe-nar-gku1244
Nucl. Acids Res.-2014-Howe-nar-gku1244Nucl. Acids Res.-2014-Howe-nar-gku1244
Nucl. Acids Res.-2014-Howe-nar-gku1244
 
How many medline platforms on the web?
How many medline platforms on the web?How many medline platforms on the web?
How many medline platforms on the web?
 
Canadian health census to lod
Canadian health census to lodCanadian health census to lod
Canadian health census to lod
 
Exploring Chemical and Biological Knowledge Spaces with PubChem
Exploring Chemical and Biological Knowledge Spaces with PubChemExploring Chemical and Biological Knowledge Spaces with PubChem
Exploring Chemical and Biological Knowledge Spaces with PubChem
 
Data for AI models, the past, the present, the future
Data for AI models, the past, the present, the futureData for AI models, the past, the present, the future
Data for AI models, the past, the present, the future
 
CEDAR work bench for metadata management
CEDAR work bench for metadata managementCEDAR work bench for metadata management
CEDAR work bench for metadata management
 
Metadata in the BioSample Online Repository are Impaired by Numerous Anomalie...
Metadata in the BioSample Online Repository are Impaired by Numerous Anomalie...Metadata in the BioSample Online Repository are Impaired by Numerous Anomalie...
Metadata in the BioSample Online Repository are Impaired by Numerous Anomalie...
 
Leveraging publication metadata to help overcome the data ingest bottleneck
Leveraging publication metadata to help overcome the data ingest bottleneck Leveraging publication metadata to help overcome the data ingest bottleneck
Leveraging publication metadata to help overcome the data ingest bottleneck
 
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
dkNET Webinar: "The Microphysiology Systems Database (MPS-Db): A Platform For...
 
Pistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier DatathonPistoia Alliance-Elsevier Datathon
Pistoia Alliance-Elsevier Datathon
 
Citing data in research articles: principles, implementation, challenges - an...
Citing data in research articles: principles, implementation, challenges - an...Citing data in research articles: principles, implementation, challenges - an...
Citing data in research articles: principles, implementation, challenges - an...
 
Embracing Semantic Technology for Better Metadata Authoring in Biomedicine (S...
Embracing Semantic Technology for Better Metadata Authoring in Biomedicine (S...Embracing Semantic Technology for Better Metadata Authoring in Biomedicine (S...
Embracing Semantic Technology for Better Metadata Authoring in Biomedicine (S...
 
Data sharing - Data management - The SysMO-SEEK Story
Data sharing - Data management - The SysMO-SEEK StoryData sharing - Data management - The SysMO-SEEK Story
Data sharing - Data management - The SysMO-SEEK Story
 
Data reuse and scholarly reward: understanding practice and building infrastr...
Data reuse and scholarly reward: understanding practice and building infrastr...Data reuse and scholarly reward: understanding practice and building infrastr...
Data reuse and scholarly reward: understanding practice and building infrastr...
 
The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata ...
The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata ...The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata ...
The CEDAR Workbench: An Ontology-Assisted Environment for Authoring Metadata ...
 

Viewers also liked

Sorafenib
SorafenibSorafenib
Sorafenib3s4num
 
Miotics mydriatics cycloplegics
Miotics mydriatics cycloplegicsMiotics mydriatics cycloplegics
Miotics mydriatics cycloplegicsChakri Psb
 
Animal Handling and Restraint
 Animal Handling and Restraint Animal Handling and Restraint
Animal Handling and Restraintjonalyn shenton
 
Miotics and mydriatics
Miotics and mydriaticsMiotics and mydriatics
Miotics and mydriaticsSSSIHMS-PG
 
Representing chemicals using OWL, Description Graphs and Rules
Representing chemicals using OWL, Description Graphs and RulesRepresenting chemicals using OWL, Description Graphs and Rules
Representing chemicals using OWL, Description Graphs and RulesMichel Dumontier
 
Podcasts And Screencasts
Podcasts And ScreencastsPodcasts And Screencasts
Podcasts And Screencastspfriedman729
 
HCS as a tool for Systems Biology
HCS as a tool for Systems BiologyHCS as a tool for Systems Biology
HCS as a tool for Systems Biologyjamesgevans
 
Real World Applications of OWL
Real World Applications of OWLReal World Applications of OWL
Real World Applications of OWLMichel Dumontier
 
Web Configurator
Web ConfiguratorWeb Configurator
Web Configuratormikuzz
 
Spiceworksintro 1219952728712087 9
Spiceworksintro 1219952728712087 9Spiceworksintro 1219952728712087 9
Spiceworksintro 1219952728712087 9fredjr
 
Tom Nastas 2011 2 page summary, english
Tom Nastas 2011 2 page summary, englishTom Nastas 2011 2 page summary, english
Tom Nastas 2011 2 page summary, englishThomas Nastas
 
Path to Commercialization, Nastas Presentation to Winner of Grant Competition
Path to Commercialization, Nastas Presentation to Winner of Grant CompetitionPath to Commercialization, Nastas Presentation to Winner of Grant Competition
Path to Commercialization, Nastas Presentation to Winner of Grant CompetitionThomas Nastas
 
OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...
OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...
OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...Michel Dumontier
 

Viewers also liked (20)

Sorafenib
SorafenibSorafenib
Sorafenib
 
Sorafenib
SorafenibSorafenib
Sorafenib
 
Mydriatics
MydriaticsMydriatics
Mydriatics
 
Invivo pharmacology
Invivo pharmacologyInvivo pharmacology
Invivo pharmacology
 
Miotics mydriatics cycloplegics
Miotics mydriatics cycloplegicsMiotics mydriatics cycloplegics
Miotics mydriatics cycloplegics
 
Ppt 16 1-2014 kpsg pdf (1) (2)
Ppt 16 1-2014 kpsg pdf (1) (2)Ppt 16 1-2014 kpsg pdf (1) (2)
Ppt 16 1-2014 kpsg pdf (1) (2)
 
Animal Handling and Restraint
 Animal Handling and Restraint Animal Handling and Restraint
Animal Handling and Restraint
 
Miotics and mydriatics
Miotics and mydriaticsMiotics and mydriatics
Miotics and mydriatics
 
Representing chemicals using OWL, Description Graphs and Rules
Representing chemicals using OWL, Description Graphs and RulesRepresenting chemicals using OWL, Description Graphs and Rules
Representing chemicals using OWL, Description Graphs and Rules
 
Podcasts And Screencasts
Podcasts And ScreencastsPodcasts And Screencasts
Podcasts And Screencasts
 
HCS as a tool for Systems Biology
HCS as a tool for Systems BiologyHCS as a tool for Systems Biology
HCS as a tool for Systems Biology
 
Real World Applications of OWL
Real World Applications of OWLReal World Applications of OWL
Real World Applications of OWL
 
Web Configurator
Web ConfiguratorWeb Configurator
Web Configurator
 
Spiceworksintro 1219952728712087 9
Spiceworksintro 1219952728712087 9Spiceworksintro 1219952728712087 9
Spiceworksintro 1219952728712087 9
 
Collaborate
CollaborateCollaborate
Collaborate
 
Tom Nastas 2011 2 page summary, english
Tom Nastas 2011 2 page summary, englishTom Nastas 2011 2 page summary, english
Tom Nastas 2011 2 page summary, english
 
Email etiquette
Email etiquetteEmail etiquette
Email etiquette
 
MSP-AzureDev101
MSP-AzureDev101MSP-AzureDev101
MSP-AzureDev101
 
Path to Commercialization, Nastas Presentation to Winner of Grant Competition
Path to Commercialization, Nastas Presentation to Winner of Grant CompetitionPath to Commercialization, Nastas Presentation to Winner of Grant Competition
Path to Commercialization, Nastas Presentation to Winner of Grant Competition
 
OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...
OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...
OWLED2009: A platform for distributing and reasoning with OWL-EL knowledge ba...
 

Semantic approaches for biomedical knowledge discovery - Discovery Science 2014 Keynote

  • 1. Semantic Approaches for Biomedical Knowledge Discovery 1 Michel Dumontier, Ph.D. Associate Professor of Medicine (Biomedical Informatics) Stanford University @micheldumontier::DS:10-10-2014
  • 4. The unbelievable growth of scientific knowledge
  • 5. Thousands of databases curate the literature into consumable facts (problems: access, format, identifiers & linking)
  • 6. Software is needed to analyze, predict and evaluate (problems: OS, versioning, input/output formats)
  • 7. Ultimately, we develop fairly sophisticated programs/workflows to test our hypotheses
  • 8. Wouldn’t it be great if we could just find the evidence required to support or dispute a scientific hypothesis using the most up-to-date and relevant data, tools and scientific knowledge?
  • 9. So what do we need to achieve this?
      1. Standards to construct and interrogate a massive, decentralized network of interconnected data and software
      2. Methods and Tools
         – To prepare, interlink, and query data
         – To mine and discover associations
         – To identify novel, supported associations
      3. Incentives and penalties
         – Funding agencies, journals, institutions, societies, conferences, workshops
  • 10. The Semantic Web is the new global web of knowledge
      – standards for publishing, sharing and querying facts, expert knowledge and services
      – scalable approach for the discovery of independently formulated and distributed knowledge
  • 11. We’re building a massive network of linked data (Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/)
  • 12. Linked Data for the Life Sciences. Bio2RDF is an open source project to unify the representation and interlinking of biological data using RDF.
      – chemicals/drugs/formulations
      – genomes/genes/proteins, domains
      – interactions, complexes & pathways
      – animal models and phenotypes
      – disease, genetic markers, treatments
      – terminologies & publications
      • Release 3 (June 2014): 11B+ interlinked statements from 35 biomedical datasets
      • dataset description, provenance & statistics
      • Partnerships with EBI, NCBI, DBCLS, NCBO, OpenPHACTS, and commercial tool providers
  • 13. Resource Description Framework
      • It’s a language to represent knowledge
         – Logic-based formalism -> automated reasoning
         – graph-like properties -> data analysis
      • Good for
         – Describing in terms of type, attributes, relations
         – Integrating data from different sources
         – Sharing the data (W3C standard)
         – Reusing what is available, developing what you need, and contributing back to the web of data.
  • 15. The linked data network expands with every reference
      DrugBank: drugbank:DB00586 rdf:type drugbank_vocabulary:Drug ; rdfs:label "diclofenac [drugbank:DB00586]"
      PharmGKB: pharmgkb:PA449293 rdf:type pharmgkb_vocabulary:Drug ; rdfs:label "diclofenac [pharmgkb:PA449293]" ; pharmgkb_vocabulary:xref drugbank:DB00586
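To make the cross-reference idea on this slide concrete, here is a minimal sketch that holds the statements as plain (subject, predicate, object) tuples and follows the xref link from the PharmGKB record to the DrugBank record. The exact statements are reconstructed from the slide's diagram and should be read as an assumption; `match` and `linked_labels` are hypothetical helper names, not part of Bio2RDF.

```python
# Toy triples modelled on the slide's diagram (the exact statements are an assumption).
TRIPLES = {
    ("drugbank:DB00586", "rdf:type", "drugbank_vocabulary:Drug"),
    ("drugbank:DB00586", "rdfs:label", "diclofenac [drugbank:DB00586]"),
    ("pharmgkb:PA449293", "rdf:type", "pharmgkb_vocabulary:Drug"),
    ("pharmgkb:PA449293", "rdfs:label", "diclofenac [pharmgkb:PA449293]"),
    ("pharmgkb:PA449293", "pharmgkb_vocabulary:xref", "drugbank:DB00586"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def linked_labels(triples, subject, link="pharmgkb_vocabulary:xref"):
    """Collect the labels of a resource and of every resource it cross-references."""
    resources = [subject] + [o for _, _, o in match(triples, s=subject, p=link)]
    return {
        r: [o for _, _, o in match(triples, s=r, p="rdfs:label")]
        for r in resources
    }


labels = linked_labels(TRIPLES, "pharmgkb:PA449293")
```

Starting from one identifier, a single xref statement is enough to reach the second dataset's description of the same drug, which is the sense in which each new reference expands the network.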
  • 19. Bio2RDF offers a highly connected network of data
  • 20. Graph summarization for query formulation
      PREFIX drugbank_vocabulary: <http://bio2rdf.org/drugbank_vocabulary:>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      SELECT ?ddi ?d1name ?d2name
      WHERE {
        ?ddi a drugbank_vocabulary:Drug-Drug-Interaction .
        ?d1 drugbank_vocabulary:ddi-interactor-in ?ddi .
        ?d1 rdfs:label ?d1name .
        ?d2 drugbank_vocabulary:ddi-interactor-in ?ddi .
        ?d2 rdfs:label ?d2name .
        FILTER (?d1 != ?d2)
      }
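A query like the one on this slide is normally sent to an endpoint over the SPARQL Protocol. The sketch below builds such a request with the Python standard library and flattens a JSON results document into tuples. The endpoint URL is hypothetical (patterned on the endpoints named elsewhere in the deck and possibly no longer online), the sample response is made up, and `build_request` and `rows` are illustrative names.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL, patterned on the endpoints named in the slides.
ENDPOINT = "http://drugbank.bio2rdf.org/sparql"

QUERY = """
PREFIX drugbank_vocabulary: <http://bio2rdf.org/drugbank_vocabulary:>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?ddi ?d1name ?d2name
WHERE {
  ?ddi a drugbank_vocabulary:Drug-Drug-Interaction .
  ?d1 drugbank_vocabulary:ddi-interactor-in ?ddi .
  ?d1 rdfs:label ?d1name .
  ?d2 drugbank_vocabulary:ddi-interactor-in ?ddi .
  ?d2 rdfs:label ?d2name .
  FILTER (?d1 != ?d2)
}
"""


def build_request(endpoint, query):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows(results_json, names):
    """Flatten a SPARQL JSON results document into tuples of plain values."""
    doc = json.loads(results_json)
    return [tuple(b[n]["value"] for n in names) for b in doc["results"]["bindings"]]


request = build_request(ENDPOINT, QUERY)
# To run against a live endpoint: urllib.request.urlopen(request).read()

# Made-up response in the standard SPARQL JSON results layout, for illustration only.
SAMPLE = json.dumps({
    "head": {"vars": ["ddi", "d1name", "d2name"]},
    "results": {"bindings": [{
        "ddi": {"type": "uri", "value": "http://example.org/ddi/1"},
        "d1name": {"type": "literal", "value": "drug A"},
        "d2name": {"type": "literal", "value": "drug B"},
    }]},
})
pairs = rows(SAMPLE, ["d1name", "d2name"])
```

Because the query only requires `?d1 != ?d2`, a real endpoint would return each interacting pair twice, once in each order.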
  • 21. You can use query assistants: http://sindicetech.com/sindice-suite/sparqled/ (graph: http://sindicetech.com/analytics)
  • 22. Federated Queries over Independent SPARQL EndPoints. Get all protein catabolic processes (and more specific) in biomodels
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      SELECT ?go ?label (COUNT(DISTINCT ?x) AS ?count)
      WHERE {
        SERVICE <http://bioportal.bio2rdf.org/sparql> {
          ?go rdfs:label ?label .
          ?go rdfs:subClassOf ?tgo .
          ?tgo rdfs:label ?tlabel .
          FILTER regex(?tlabel, "^protein catabolic process")
        }
        SERVICE <http://biomodels.bio2rdf.org/sparql> {
          ?x <http://bio2rdf.org/biopax_vocabulary:identical-to> ?go .
          ?x a <http://www.biopax.org/release/biopax-level3.owl#BiochemicalReaction> .
        }
      }
      GROUP BY ?go ?label
  • 24. Despite all the data, it’s still hard to find answers to questions, because there are many ways to represent the same data and each dataset represents it differently
  • 25. Multiple formalizations of the same kind of data do emerge, each with their own merit
  • 26. Massive Proliferation of Ontologies / Vocabularies
  • 27. Multi-Stakeholder Efforts to Standardize Representations are Reasonable, Long Term Strategies for Data Integration
  • 28. Semantic data integration, consistency checking and query answering over Bio2RDF with the Semanticscience Integrated Ontology (SIO)
      [Figure: dataset instances (uniprot:P05067, pharmgkb:PA30917, omim:189931) are linked by "is a" to dataset classes (uniprot:Protein, refseq:Protein, omim:Gene, pharmgkb:Gene), which are linked by "is a" to ontology classes such as sio:gene; layers labelled dataset, ontology, Knowledge Base]
      Querying Bio2RDF Linked Open Data with a Global Schema. Alison Callahan, José Cruz-Toledo and Michel Dumontier. Bio-ontologies 2012.
  • 29. SRIQ(D); 10700+ axioms; 1300+ classes; 201 object properties (inc. inverses); 1 datatype property
  • 30. Bio2RDF and SIO powered SPARQL 1.1 federated query: find chemicals (from CTD) and proteins (from SGD) that participate in the same process (from GOA)
      PREFIX sio: <http://semanticscience.org/resource/>
      PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      SELECT ?chem ?prot ?proc
      FROM <http://bio2rdf.org/ctd>
      WHERE {
        SERVICE <http://ctd.bio2rdf.org/sparql> {
          ?chemical a sio:chemical-entity .
          ?chemical rdfs:label ?chem .
          ?chemical sio:is-participant-in ?process .
          ?process rdfs:label ?proc .
          FILTER regex(str(?process), "http://bio2rdf.org/go:")
        }
        SERVICE <http://sgd.bio2rdf.org/sparql> {
          ?protein a sio:protein .
          ?protein sio:is-participant-in ?process .
          ?protein rdfs:label ?prot .
        }
      }
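Conceptually, a federated query of this kind evaluates each SERVICE block at its own endpoint and then joins the two result sets on the variables they share (here `?process`). The following is a rough sketch of that join over made-up bindings; it is not how any particular SPARQL engine is implemented, and the row values are placeholders rather than real CTD or SGD data.

```python
def join(left, right):
    """Natural join of two lists of variable bindings (one dict per solution)."""
    joined = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            # Two solutions are compatible when they agree on every shared variable.
            if all(l[v] == r[v] for v in shared):
                joined.append({**l, **r})
    return joined


# Made-up bindings standing in for the results of the two SERVICE blocks.
ctd_rows = [
    {"chem": "chemical A", "process": "go:1", "proc": "process one"},
    {"chem": "chemical B", "process": "go:2", "proc": "process two"},
]
sgd_rows = [
    {"prot": "protein X", "process": "go:1"},
    {"prot": "protein Y", "process": "go:3"},
]

result = [(r["chem"], r["prot"], r["proc"]) for r in join(ctd_rows, sgd_rows)]
```

Only the chemical and protein that share `go:1` survive the join, which mirrors the "participate in the same process" condition of the query.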
  • 31. Tactical formalization: take what you need and represent it in a way that directly serves your objective. STANDARDS / USER DRIVEN REPRESENTATION. Examples: identifying aberrant and pharmacological pathways; predicting drug targets using organism phenotypes; BioPAX pathway exploration; FALDO-powered genome navigation
  • 32. Aberrant and pharmacological pathways. Q1. Can we identify pathways that are associated with a particular disease or class of diseases? Q2. Can we identify pathways that are associated with a particular drug or class of drugs? [Diagram: drug, pathway, disease, gene]
  • 33. Identification of drug and disease enriched pathways • Approach – Integrate 3 datasets • DrugBank, PharmGKB and CTD – Integrate 7 terminologies • MeSH, ATC, ChEBI, UMLS, SNOMED, ICD, DO – Formalize data of interest – Identify significant associations using enrichment analysis over the fully inferred knowledge base Identifying aberrant pathways through integrated analysis of knowledge in pharmacogenomics. Bioinformatics. 2012.
  • 34. Formal knowledge representation as a strategy for data integration
  • 35. Have you heard of OWL?
  • 36. [Figure: mercaptopurine identifiers (mercaptopurine [pharmgkb:PA450379], mercaptopurine [drugbank:DB01033], purine-6-thiol [CHEBI:2208], mercaptopurine [ATC:L01BB02], mercaptopurine [mesh:D015122]) related by class equivalence and class subsumption; top-level classes (drug, disease, gene, pathway) declared disjoint; property chains; reciprocal existentials among drug, disease, pathway and gene.] Formalized as an OWL-EL ontology: 650,000+ classes, 3.2M subClassOf axioms, 75,000 equivalentClass axioms
  • 37. Benefits: Enhanced Query Capability – Use any mapped terminology to query a target resource. – Use knowledge in target ontologies to formulate more precise questions • ask for drugs that are associated with diseases of the joint: ‘Chikungunya’ (do:0050012) is defined as a viral infectious disease located in the ‘joint’ (fma:7490) and caused by a ‘Chikungunya virus’ (taxon:37124). – Learn relationships that are inferred by automated reasoning. • alcohol (ChEBI:30879) is associated with alcoholism (PA443309) since alcoholism is directly associated with ethanol (CHEBI:16236) • ‘parasitic infectious disease’ (do:0001398) associated with 129 drugs, 15 more than are directly linked.
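The last point on this slide (a disease class gaining drug associations beyond those directly asserted) can be pictured as propagating associations up a subclass hierarchy. The sketch below is a simplified stand-in for OWL reasoning, not the reasoner used in the talk; all `ex:` identifiers and the function names are hypothetical toy values.

```python
from collections import defaultdict


def descendants(cls, subclass_of):
    """Return cls plus every class that is transitively a subclass of it."""
    children = defaultdict(set)
    for child, parent in subclass_of:
        children[parent].add(child)
    seen, stack = {cls}, [cls]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def drugs_for(cls, subclass_of, direct):
    """Drugs linked to cls directly or through any of its subclasses."""
    found = set()
    for c in descendants(cls, subclass_of):
        found |= direct.get(c, set())
    return found


# Toy hierarchy and associations (hypothetical identifiers).
subclass_of = [
    ("ex:malaria", "ex:parasitic_disease"),
    ("ex:falciparum_malaria", "ex:malaria"),
]
direct = {
    "ex:parasitic_disease": {"ex:drugA"},
    "ex:falciparum_malaria": {"ex:drugB"},
}

inferred = drugs_for("ex:parasitic_disease", subclass_of, direct)
print(sorted(inferred))
```

Here the parent class has one directly linked drug and gains a second through a subclass two levels down, which is the same effect as the "15 more than are directly linked" observation, at toy scale.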
  • 38. Knowledge Discovery through Data Integration and Enrichment Analysis • OntoFunc: a tool to discover significant associations between sets of objects and ontology categories. It tests for enrichment of an attribute among a selected set of input items as compared to a reference set, using the hypergeometric or binomial distribution, Fisher's exact test, or a chi-square test. • We found 22,653 disease-pathway associations, where for each pathway we find genes that are linked to disease. – Mood disorder (do:3324) associated with Zidovudine Pathway (pharmgkb:PA165859361). Zidovudine is for treating HIV/AIDS. Side effects include fatigue, headache, myalgia, malaise and anorexia • We found 13,826 pathway-chemical associations – Clopidogrel (chebi:37941) associated with Endothelin signaling pathway (pharmgkb:PA164728163). Endothelins are proteins that constrict blood vessels and raise blood pressure. Clopidogrel inhibits platelet aggregation and prolongs bleeding time.
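As a rough illustration of the enrichment test named on this slide, the sketch below computes a one-sided hypergeometric p-value using only the Python standard library. It is not OntoFunc's implementation; the function name and the toy counts are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when drawing n items from N, of which K carry the attribute.

    k: selected items that carry the attribute (e.g. pathway genes linked to a disease)
    n: size of the selected set (e.g. genes in the pathway)
    K: items in the reference set that carry the attribute
    N: size of the reference set
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers: 5 of 5 selected genes carry an attribute held by 5 of 10 reference genes.
p_value = hypergeom_enrichment_p(k=5, n=5, K=5, N=10)
print(p_value)
```

A small p-value means the overlap is unlikely under random selection from the reference set. In practice thousands of category tests are run, so a multiple-testing correction would normally follow.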
  • 39. Tactical Formalization + Automated Reasoning Offers Compelling Value for Certain Problems. We need to be smart about the goal, and how best to achieve it. Tactical formalization is another tool in the toolbox. We’ve formalized data as OWL ontologies to verify, fix and exploit Linked Data through expressive OWL reasoning • To identify mistakes in human curated knowledge • To identify conflicting meaning in terms • To identify mistakes in the representation of RDF data – incorrect use of relations – incorrect assertion of identity (owl:sameAs) Many other applications can be envisioned.
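One of the checks listed above, spotting incorrect owl:sameAs assertions, can be approximated without a full reasoner: if two resources are declared identical but are typed by classes declared disjoint, the identity link is suspect. The sketch below is a deliberately simplified stand-in for OWL consistency checking; the function name and all `ex:` identifiers are hypothetical.

```python
def suspicious_same_as(same_as, types, disjoint):
    """Return sameAs pairs whose asserted types belong to disjoint classes."""
    disjoint_pairs = {frozenset(pair) for pair in disjoint}
    flagged = []
    for a, b in same_as:
        for type_a in types.get(a, ()):
            for type_b in types.get(b, ()):
                if frozenset((type_a, type_b)) in disjoint_pairs:
                    flagged.append((a, b, type_a, type_b))
    return flagged


# Toy data: a gene record wrongly declared identical to a protein record.
same_as = [("ex:gene1", "ex:protein1"), ("ex:gene1", "ex:gene1_copy")]
types = {
    "ex:gene1": ["ex:Gene"],
    "ex:protein1": ["ex:Protein"],
    "ex:gene1_copy": ["ex:Gene"],
}
disjoint = [("ex:Gene", "ex:Protein")]

problems = suspicious_same_as(same_as, types, disjoint)
print(problems)
```

A reasoner would additionally use inferred types and subclass relations; this sketch only looks at directly asserted types.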
  • 40. PhenomeDrug: A computational approach to predict drug targets, drug effects, and drug indications using phenotypes. Mouse model phenotypes provide information about human drug targets. Hoehndorf R, Hiebert T, Hardy NW, Schofield PN, Gkoutos GV, Dumontier M. Bioinformatics. 2013.
  • 41. Animal models provide insight for on-target effects • In the majority of 100 best selling drugs ($148B in US alone), there is a direct correlation between knockout phenotype and drug effect • Immunological Indications – Anti-histamines (Claritin, Allegra, Zyrtec) – KO of histamine H1 receptor leads to decreased responsiveness of immune system – Predicts on-target effects: drowsiness, reduced anxiety. Zambrowicz and Sands. Nat Rev Drug Disc. 2003.
  • 42. Identifying drug targets from mouse knock-out phenotypes. [Diagram: drug, gene, phenotypes, effects, human gene, non-functional gene, model, ortholog, similar, inhibits] Main idea: if a drug’s phenotypes match the phenotypes of a null model, this suggests that the drug is an inhibitor of the gene
  • 43. Terminological Interoperability (we must compare apples with apples). Mouse Phenotypes; Drug effects (mappings from UMLS to DO, NBO, MP); Mammalian Phenotype Ontology; PhenomeNet; PhenomeDrug. [Figure, phenotype lists: (a) poor coordination, decreased gut peristalsis, axon degeneration, decreased stride length, stereotypic behavior, abnormal EEG, failure to find food; (b) unstable posture, constipation, neuronal loss in substantia nigra, shuffling gait, resting tremors, REM disorder, hyposmia; (c) poor rotarod performance, decreased gut peristalsis, axon degeneration, decreased stride length, stereotypic behavior, abnormal EEG, failure to find food; (d) abnormal coordination, abnormal digestive physiology, CNS neuron degeneration, abnormal locomotion, abnormal motor function, sleep disturbance, abnormal olfaction]
  • 44. Semantic Similarity. Given a drug effect profile D and a mouse model M, we compute the semantic similarity as an information-weighted Jaccard metric. The similarity measure used is non-symmetrical and determines the amount of information about a drug effect profile D that is covered by a set of mouse model phenotypes M.
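The slide does not give the formula, so the sketch below shows one plausible reading of a non-symmetrical, information-weighted measure: the information content (IC) of the terms shared by D and M, divided by the total IC of D. This normalization is an assumption, as are the function name and the IC values; the phenotype labels are borrowed from the previous slide, and both profiles are assumed to be already expanded with their ontology superclasses.

```python
def weighted_coverage(drug_effects, model_phenotypes, ic):
    """Fraction of the information in profile D that is covered by profile M."""
    d, m = set(drug_effects), set(model_phenotypes)
    total = sum(ic[term] for term in d)
    if total == 0:
        return 0.0
    return sum(ic[term] for term in d & m) / total


# Hypothetical information content values (higher means more specific).
ic = {
    "abnormal locomotion": 1.0,
    "sleep disturbance": 3.0,
    "abnormal olfaction": 2.0,
}
drug_profile = {"abnormal locomotion", "sleep disturbance"}
mouse_profile = {"abnormal locomotion", "abnormal olfaction"}

d_covered_by_m = weighted_coverage(drug_profile, mouse_profile, ic)
m_covered_by_d = weighted_coverage(mouse_profile, drug_profile, ic)
print(d_covered_by_m, m_covered_by_d)
```

Swapping the arguments changes the denominator, which is why the two directions give different scores: the measure asks how much of D is explained by M, not how alike the two sets are overall.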
  • 45. Loss of function models predict targets of inhibitor drugs • 14,682 drugs; 7,255 mouse genotypes • Validation against known and predicted inhibitor-target pairs – 0.76 ROC AUC for human targets (DrugBank) – 0.81 ROC AUC for mouse targets (STITCH) • diclofenac (STITCH:000003032) – NSAID used to treat pain, osteoarthritis and rheumatoid arthritis – Drug effects include liver inflammation (hepatitis), swelling of liver (hepatomegaly), redness of skin (erythema) – 49% explained by PPARg knockout • peroxisome proliferator activated receptor gamma (PPARg) regulates metabolism, proliferation, inflammation and differentiation • Diclofenac is a known inhibitor – 46% explained by COX-2 knockout • Diclofenac is a known inhibitor
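For readers unfamiliar with the ROC AUC figures quoted above, the sketch below computes the statistic directly from its pairwise definition: the probability that a randomly chosen true drug-target pair is scored above a randomly chosen non-target pair. The function name, scores and labels are toy values for illustration, not data from the study.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # Ties between a positive and a negative count as half a correct ranking.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy similarity scores for four drug-gene pairs; 1 marks a known inhibitor-target pair.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]

auc = roc_auc(toy_scores, toy_labels)
print(auc)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, so 0.76 and 0.81 indicate that known targets tend to rank well above non-targets.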
  • 47. Using the Semantic Web to Gather Evidence for Scientific Hypotheses What evidence supports or disputes that TKIs are cardiotoxic?
  • 48. FDA Use Case: TKI non-QT Cardiotoxicity • Tyrosine Kinase Inhibitors (TKIs) – Imatinib, Sorafenib, Sunitinib, Dasatinib, Nilotinib, Lapatinib – Used to treat cancer – Linked to cardiotoxicity. • FDA launched drug safety program to detect toxicity – Need to integrate data and ontologies (Abernethy, CPT 2011) – Bai and Abernethy (2013) suggest using public data in genetics, pharmacology, toxicology and systems biology to predict/validate adverse events • What evidence could we gather to give credence to the claim that TKIs cause non-QT cardiotoxicity?
  • 49. Jane P.F. Bai and Darrell R. Abernethy. Systems Pharmacology to Predict Drug Toxicity: Integration Across Levels of Biological Organization. Annu. Rev. Pharmacol. Toxicol. 2013;53:451-473
  • 50. HyQue • The goal of HyQue is to retrieve and evaluate evidence that supports/disputes a hypothesis – hypotheses are described as a set of events • e.g. binding, inhibition, phenotypic effect – events are associated with types of evidence • a query is written to retrieve data • a weight is assigned to provide significance • Hypotheses are written by people who seek answers • Data retrieval rules are written by people who know the data and how it should be interpreted. References: 1. HyQue: Evaluating hypotheses using Semantic Web technologies. J Biomed Semantics. 2011 May 17;2 Suppl 2:S3. 2. Evaluating scientific hypotheses using the SPARQL Inferencing Notation. Extended Semantic Web Conference (ESWC 2012). Heraklion, Crete. May 27-31, 2012.
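The structure described on this slide (a hypothesis as a set of events, each event tied to weighted evidence rules backed by a query) can be sketched in a few lines. This is an illustrative toy, not HyQue's actual scoring scheme: the class and function names are hypothetical, each `retrieve` callable stands in for a SPARQL query, and averaging event scores is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvidenceRule:
    description: str
    retrieve: Callable[[], list]  # stand-in for a SPARQL query returning result rows
    weight: float                 # significance assigned by a data expert


def score_event(rules: List[EvidenceRule]) -> float:
    """Weighted fraction of an event's evidence rules that returned data."""
    total = sum(rule.weight for rule in rules)
    if total == 0:
        return 0.0
    supported = sum(rule.weight for rule in rules if rule.retrieve())
    return supported / total


def score_hypothesis(events: Dict[str, List[EvidenceRule]]) -> float:
    """Average of the event scores (an assumed aggregation)."""
    if not events:
        return 0.0
    return sum(score_event(rules) for rules in events.values()) / len(events)


# Toy hypothesis with stubbed retrieval results.
hypothesis = {
    "inhibition": [
        EvidenceRule("drug-target record found", lambda: ["toy hit"], 2.0),
        EvidenceRule("in vitro assay found", lambda: [], 1.0),
    ],
    "phenotypic effect": [
        EvidenceRule("adverse event report found", lambda: ["toy hit"], 1.0),
    ],
}

score = score_hypothesis(hypothesis)
print(round(score, 3))
```

Keeping the rules separate from the hypothesis mirrors the division of labor on the slide: one group states what should be true, another encodes how to look for it and how much each finding counts.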
  • 51. HyQue: A Semantic Web Application [Diagram: Software, Ontologies, Data, Hypothesis Evaluation]
  • 52. What evidence might we gather? • clinical: Are there cardiotoxic effects associated with the drug? – Literature (studies) [curated db] – Product labels (studies) [r3:sider] – Clinical trials (studies) [r3:clinicaltrials] – Adverse event reports [r2:pharmgkb/onesides] – Electronic health records (observations) • pre-clinical associations: – genotype-phenotype (null/disease models) [r2:mgi, r2:sgd; r3:wormbase] – in vitro assays (IC50) [r3:chembl] – drug targets [r2:drugbank; r2:ctd; r3:stitch] – drug-gene expression [r3:gxa] – pathways [r2:kegg; r3:reactome] – Drug-pathway, disease-pathway enrichments [aberrant pathways] – Chemical properties [r2:pubchem; r2:drugbank] – Toxicology [r1:toxkb/cebs]
  • 53. Data retrieval is done with SPARQL
  • 54. Data Evaluation is done with SPIN rules
  • 60. In Summary • This talk was about making sense of and using the structured data we already have • RDF-based Linked Open Data acts as a substrate for query answering and task-based formalization in OWL • Discovery through the generation of testable hypotheses in the target domain. • Using Linked Data to evaluate scientific hypotheses
  • 61. Looking to the Future • Community guidelines for RDF-based data and dataset descriptions (e.g. CEDAR) • Alignment and consolidation of OWL ontologies (e.g. UMLS) • Identifying and filling gaps in our knowledge (e.g. Adam the Robot scientist) • Improving our coverage of available evidence (e.g. HyQue) • More sophisticated data mining (e.g. you!)
  • 62. Acknowledgements. Bio2RDF Release 2: Alison Callahan, Jose Cruz-Toledo, Peter Ansell. Aberrant Pathways: Robert Hoehndorf, Georgios Gkoutos. PhenomeDrug: Tanya Hiebert, Robert Hoehndorf, Georgios Gkoutos, Paul Schofield. TKI Cardiotoxicity: Alison Callahan, Tanya Hiebert, Beatriz Lujan, Sira Sarntivijai (FDA)