Knowledge Discovery using an Integrated Semantic Web
1. Knowledge Discovery using an Integrated
Semantic Web
Michel Dumontier
Department of Biology, School of Computer Science, Institute of Biochemistry
Ottawa Institute for Systems Biology
Ottawa-Carleton Institute for Biomedical Engineering
Carleton University
Ottawa, Canada
Chair, W3C Semantic Web for Health Care and Life Sciences Interest Group
1 BH2012
4. uncovering a sufficient amount of evidence to support/refute
a hypothesis is becoming increasingly difficult
it requires a lot of digging around
5. continuous growth in research literature
Source: http://www.nlm.nih.gov/bsd/stats/cit_added.html
9. What if we could just pose a hypothesis
and have a system automatically use
available data, ontologies and services?
10. HyQue
HyQue is the Hypothesis query and evaluation system
• A platform for knowledge discovery
• Facilitates hypothesis formulation and evaluation
• Leverages Semantic Web technologies to provide access to
facts, expert knowledge and web services
• Conforms to a simplified event-based model
• Supports evaluation against positive and negative findings
• Transparent and reproducible evidence prioritization
• Provenance across all elements of hypothesis testing
– trace a hypothesis to its evaluation, including the data and rules used
Evaluating scientific hypotheses using the SPARQL Inferencing Notation. Extended Semantic Web Conference
(ESWC 2012). Heraklion, Crete. May 27-31, 2012.
HyQue: evaluating hypotheses using Semantic Web technologies. J Biomed Semantics. 2011 May 17;2 Suppl 2:S3.
12. Event-based data model
HyQue events denote a phenomenon involving two
objects: ‘agent’ and ‘target’. In addition, we can specify the
location of this event (e.g. located in the nucleus, or under
some genetic background)
Event
• ‘has agent’ agent
• ‘has target’ target
• ‘is located in’ location
• ‘is negated’ boolean
Supported events
1. protein-protein binding
2. protein-nucleic acid binding
3. molecular activation
4. molecular inhibition
5. gene induction
6. gene repression
7. transport
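The event model above can be sketched as a small data structure. This is a minimal illustration, not HyQue's actual implementation: the class and field names (`EventType`, `Event`, `event_type`, `negated`) and the placeholder agent and target labels are assumptions chosen to mirror the four properties and seven event types listed on the slide.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EventType(Enum):
    """The seven event types listed as supported on the slide."""
    PROTEIN_PROTEIN_BINDING = "protein-protein binding"
    PROTEIN_NUCLEIC_ACID_BINDING = "protein-nucleic acid binding"
    MOLECULAR_ACTIVATION = "molecular activation"
    MOLECULAR_INHIBITION = "molecular inhibition"
    GENE_INDUCTION = "gene induction"
    GENE_REPRESSION = "gene repression"
    TRANSPORT = "transport"


@dataclass(frozen=True)
class Event:
    """One event: an agent acting on a target, optionally located and negated."""
    event_type: EventType
    agent: str                      # 'has agent'
    target: str                     # 'has target'
    location: Optional[str] = None  # 'is located in' (optional)
    negated: bool = False           # 'is negated'


# Placeholder labels, for illustration only.
example = Event(
    event_type=EventType.GENE_INDUCTION,
    agent="protein A",
    target="gene B",
    location="nucleus",
)
print(example)
```

Making the event immutable (`frozen=True`) is a design choice for this sketch: a hypothesis that is traced through to its evaluation should not change along the way.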
13. HyQue domain rules calculate a quantitative
measure of evidence for an event
‘induce’ rule (maximum score: 5):
– Is event negated?
• If yes, subtract 2
– Is event of type ‘induce’? (GO:0010628)
• If yes, add 1; if no, subtract 1
– Is agent of type ‘protein’ or ‘RNA’? (CHEBI:36080)
• If yes, add 1; if type ‘gene’, subtract 1
– Is target of type ‘gene’? (SO:0000236)
• If yes, add 1; if no, subtract 1
– Does agent have known ‘transcription factor activity’? (GO:0003700)
• If yes, add 1
– Is event located in the ‘nucleus’? (GO:0005634)
• If yes, add 1; if no, subtract 1
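The scoring steps of the ‘induce’ rule can be written out as a short function. This is a sketch under stated assumptions, not HyQue's own code (the cited papers describe rules expressed in the SPARQL Inferencing Notation): the function name, the parameter names and the use of plain strings for types are hypothetical, and cases the slide leaves unspecified (for example an agent that is neither protein, RNA nor gene) are assumed to contribute 0.

```python
def score_induce_event(
    *,
    event_type: str,
    agent_type: str,
    target_type: str,
    agent_is_transcription_factor: bool,
    location: str,
    negated: bool = False,
) -> int:
    """Apply the 'induce' rule from the slide and return the evidence score."""
    score = 0

    # Is event negated? If yes, subtract 2.
    if negated:
        score -= 2

    # Is event of type 'induce'? Add 1 if yes, subtract 1 if no.
    score += 1 if event_type == "induce" else -1

    # Is agent a protein or RNA? Add 1; a gene subtracts 1.
    # Any other agent type is assumed to leave the score unchanged.
    if agent_type in ("protein", "RNA"):
        score += 1
    elif agent_type == "gene":
        score -= 1

    # Is target of type 'gene'? Add 1 if yes, subtract 1 if no.
    score += 1 if target_type == "gene" else -1

    # Known transcription factor activity adds 1; absence is not penalized.
    if agent_is_transcription_factor:
        score += 1

    # Is event located in the nucleus? Add 1 if yes, subtract 1 if no.
    score += 1 if location == "nucleus" else -1

    return score


best = score_induce_event(
    event_type="induce",
    agent_type="protein",
    target_type="gene",
    agent_is_transcription_factor=True,
    location="nucleus",
)
print(best)  # 5, the maximum score stated on the slide
```

With every condition satisfied the function returns 5, which matches the stated maximum. Negating the same event yields 3, and the least supportive combination under these assumptions yields -6.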
15. The Semantic Web
is the new global web of knowledge
It involves standards for publishing, sharing and querying
facts, expert knowledge and services
It is a scalable approach to the
discovery of independently formulated
and distributed knowledge
16. something you can search,
look up, link to, check
consistency of, and query for
17. An ever-expanding web of linked data
“Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/”
18. Bio2RDF provides a
simple convention
and infrastructure to
provide linked data
for the life sciences
19. linked data for the life sciences
An Open Source Project for the Provision of
Scalable, Decentralized Data with Global Mirroring
and Customizable Query Resolution
http://bio2rdf.org/ns:id
Laval University, Carleton University, Queensland University of Technology
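The `http://bio2rdf.org/ns:id` convention shown on the slide can be illustrated with two small helpers that build and split such URIs. This is a sketch only: the function names are hypothetical, lower-casing the namespace is an assumption of this example, and the DrugBank identifier is used purely as a sample value.

```python
from typing import Tuple

BIO2RDF_BASE = "http://bio2rdf.org/"


def bio2rdf_uri(namespace: str, identifier: str) -> str:
    """Build a URI of the form http://bio2rdf.org/ns:id."""
    namespace = namespace.strip().lower()  # assumption: namespaces are lower case
    identifier = identifier.strip()
    if not namespace or not identifier:
        raise ValueError("namespace and identifier must both be non-empty")
    return f"{BIO2RDF_BASE}{namespace}:{identifier}"


def split_bio2rdf_uri(uri: str) -> Tuple[str, str]:
    """Recover (namespace, identifier) from a URI built by bio2rdf_uri."""
    if not uri.startswith(BIO2RDF_BASE):
        raise ValueError(f"not a Bio2RDF URI: {uri}")
    # Split on the first colon only, so identifiers may contain colons.
    namespace, separator, identifier = uri[len(BIO2RDF_BASE):].partition(":")
    if not separator or not namespace or not identifier:
        raise ValueError(f"expected the form ns:id after the base: {uri}")
    return namespace, identifier


uri = bio2rdf_uri("drugbank", "DB00001")
print(uri)                     # http://bio2rdf.org/drugbank:DB00001
print(split_bio2rdf_uri(uri))  # ('drugbank', 'DB00001')
```

Because every record gets a predictable URI, datasets that mention the same namespace and identifier link to the same resource without extra coordination.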
25. More sophisticated OWL-based Data Integration,
Consistency Checking and Discovery
Robert Hoehndorf
• Checking the consistency of semantic annotations [1]
– Formalized semantic annotations in SBML models as OWL axioms.
Automated reasoning uncovered inconsistencies in 16 models.
• e.g. alpha-D-glucose phosphate is not the required ATP in an ATP-dependent
reaction (GO + ChEBI + disjoint + closure axioms)
• Finding significant biomedical associations [2] (initiated at BH11)
– found significant associations between genes, drugs, diseases and
pathways using Drugbank, PharmGKB, CTD, PID across categories
of drugs (ChEBI, ATC, MeSH) and diseases (DO, MeSH)
– 22,653 pathway-disease type associations (6,304 over; 16,349 under)
• carcinosarcoma (DOID:4236) and (HIV RT) Zidovudine Pathway
(PharmGKB:PA165859361)
– 13,826 pathway-chemical type associations (12,564 over; 1,262 under)
• drug clopidogrel (CHEBI:37941) with Endothelin signaling pathway
(PharmGKB:PA164728163) -> (smooth muscle mitogenesis)
http://pharmgkb-owl.googlecode.com
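The consistency-checking idea in the first bullet can be illustrated with a toy disjointness check. This is a deliberately simplified sketch and not the OWL reasoning used in the cited work: it assumes the types of each entity have already been collected (from the model's annotation and from what the reaction type requires), and it only looks for entities typed with two classes declared disjoint. The function name and the entity label `participant_1` are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set, Tuple


def find_disjointness_violations(
    types_by_entity: Dict[str, Set[str]],
    disjoint_pairs: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (entity, class_a, class_b) wherever an entity has two disjoint types."""
    disjoint = {frozenset(pair) for pair in disjoint_pairs}
    violations = []
    for entity, types in types_by_entity.items():
        # Compare every unordered pair of types asserted for this entity.
        for class_a, class_b in combinations(sorted(types), 2):
            if frozenset((class_a, class_b)) in disjoint:
                violations.append((entity, class_a, class_b))
    return violations


# The model annotates the participant as alpha-D-glucose phosphate, while an
# ATP-dependent reaction requires that participant to be ATP.
types_by_entity = {"participant_1": {"alpha-D-glucose phosphate", "ATP"}}
disjoint_pairs = [("ATP", "alpha-D-glucose phosphate")]

violations = find_disjointness_violations(types_by_entity, disjoint_pairs)
print(violations)
```

A full reasoner derives the required types from the ontology axioms (GO, ChEBI, disjointness and closure axioms) rather than taking them as input, which is what allows it to uncover inconsistencies that are not visible in the annotations alone.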
1. Integrating systems biology models and biomedical ontologies. BMC Systems Biology. 2011. 5:124
2. Identifying aberrant pathways through integrated analysis of knowledge in pharmacogenomics. Bioinformatics. 2012. in press
26. Personal Health Lens
Mark Wilkinson
Chris Baker
Observation: Patients often look up new/alternative drugs to treat their
condition or alleviate side effects.
Opportunity: A patient-centric health care application that identifies
contraindications for drugs mentioned on web pages using the patient’s
own health data
Components:
• RDFized patient data
• Bio2RDF semantically annotated data
• SADI semantic web services to process the page and retrieve data
• SHARE automatic workflow composition
28. Matthias Samwald
We are developing simple, cheap and ubiquitous solutions for
anchoring pharmacogenomics in medical practice
Curated and unified set of 385+ essential markers, 50+ pharmacogenes
and a rule system under one standardized model:
The Medicine Safety Code
W3C Task Force: Clinical Decision
Support for Personalized Medicine