1. The Translational Medicine
Ontology and Knowledge Base:
Using Semantic Web Technology in
Personalized Medicine for
Data Integration
Joanne S. Luciano
Research Associate Professor
Tetherless World Constellation
Rensselaer Polytechnic Institute, Troy, NY USA
2. Translational Medicine
“...the process by which the results of
research done in the laboratory are
directly used to develop new ways to treat
patients and research done outside the
laboratory is used to inform laboratory
research”
3. World Wide Web
• World Wide Web (WWW) - a system of extensively
interlinked hypertext documents
• Hyper-Text Markup Language (HTML) - the standard
language for formatting and displaying documents
on the World Wide Web.
• Hyper-Text Transfer Protocol (HTTP) - a protocol to
transfer hypertext requests and information
between servers and browsers.
http://dictionary.reference.com/
4. Semantic Web
• Giant Global Graph (GGG) - the Web is moving from
content plus pointers to content plus pointers
plus relationships plus descriptions.
• Starting with RDF, OWL and SPARQL.
• Not magic bullets, but the tools which allow us
to break free of the document layer.
http://dictionary.reference.com/browse/www
http://dig.csail.mit.edu/breadcrumbs/node/215
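The shift from "content plus pointers" to "content plus pointers plus relationships plus descriptions" can be sketched with plain subject-predicate-object triples, the data model behind RDF. The following is only a minimal illustration in standard-library Python, not an RDF or SPARQL implementation; the `ex:` names and the three facts are hypothetical and appear in no slide.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical facts, invented for illustration only.
TRIPLES: List[Triple] = [
    ("ex:patient1", "rdf:type", "ex:Patient"),
    ("ex:patient1", "ex:hasDiagnosis", "ex:AlzheimersDisease"),
    ("ex:AlzheimersDisease", "rdfs:label", "Alzheimer's disease"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


# "Who has a diagnosis of Alzheimer's disease?" expressed as a triple pattern.
diagnosed = [
    subject
    for subject, _, _ in match(TRIPLES, p="ex:hasDiagnosis", o="ex:AlzheimersDisease")
]
print(diagnosed)
```

Because each relationship is itself a named piece of data, the same `match` function can answer questions about people, diseases, or labels without a schema written for each case.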
5. Personalized Medicine
• We are moving towards a system of
personalized medicine.
• This requires more data availability and
smarter systems on all sides.
• NOW IMAGINE – if data were as
easily accessible as web pages
6. Assembly of Knowledge
• Identify it
• Access it
• Get it
• Assemble it
• Make sense out of it
[Figure: top-down and bottom-up assembly workflow. Labels: AD patient scenario (timeline); Medical Experts (serving a Role); Initial Query Sketches; TMO Ontology; Semantic Extensions; Existing Ontologies (OBO, RO, BFO); Final Executable Queries; Additional Data needed for Use Case; Existing Data.]
7. Personalized Medicine
Questions are Complex
• Understand disease heterogeneity
• Comprehend disease progression
• Determine genetic and environmental contributors
• Create treatments against relevant targets
– Drugs against relevant targets (molecular structures)
– Yoga against stress
– Exercise against obesity
– Elimination diet against food intolerance or allergy
• Develop markers to predict response
• Identify concrete endpoints to measure response
8. Personalized Medicine
Data are Complex
Need an integrative data environment to answer
scientific questions
– Patient data
• Genetics, epigenetics, expression,
environment, phenotype, demographic
– Treatment data
• Existing drugs, mechanisms of action
– Disease data
• Human and animal models
– Standard of care
• Diagnostic guidelines
Data split up in many different places (bug or feature?)
9. W3C’s HCLS
Interest Group
Mission:
…use of Semantic Web
technologies for health care
and life science, with focus
on …
translational medicine.
These domains stand to
gain tremendous benefit
…as they depend on the
interoperability of
information from many
domains and processes for
efficient decision support.
10. Participants
Bosse Andersson, AstraZeneca
Colin Batchelor, RSC
Olivier Bodenreider, NIH
Tim Clark, HMS
Christi Denney, Eli Lilly
Christopher Domarew, Albany Medical Center
Michel Dumontier, Carleton University
Thomas Gambet, W3C
Lee Harland, Pfizer
Anja Jentzsch, Free University Berlin
Vipul Kashyap, Cigna
Peter Kos, HMS
Julia Kozlovsky, AstraZeneca
Timothy Lebo, RPI
Joanne Luciano, RPI
Scott Marshall, Leiden University Medical Center
Jim McCusker, RPI
Deborah McGuinness, RPI
Jim McGurk, Daiichi Sankyo
Chimezie Ogbuji, Cleveland Clinic
Elgar Pichler, AstraZeneca
Bob Powers, Predictive Medicine
Eric Prud'hommeaux, W3C
Matthias Samwald, DERI
Lynn Schriml, University of Maryland
Susie Stephens, Johnson & Johnson Pharmaceutical R&D
Peter Tonellato, HMS
Trish Whetzel, Stanford
Jun Zhao, Oxford University
11. Alzheimer’s Disease
Scenario 1
• Patient visits clinician who enters
symptoms into EHR.
• Physician does a differential diagnosis
with working diagnosis of AD.
• Physician arranges for a battery of
tests, all entered into EHR.
12. Mapping Terms to
Existing Ontologies
Identify key terms and look for a standard
ontology that contains each term
In Patient Scenario Step 1,
map the word “patient” to the “patient role”
in the Ontology for Biomedical Investigations
(OBI) [OBI:0000093]
and “Physician” to the NCI Thesaurus term
“Physician”
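The term-mapping step can be sketched as a simple lookup table. This is an illustrative assumption about how such a mapping might be recorded, not the method the TMO authors used. The `OntologyTerm` type, `TERM_MAP`, and `map_terms` are hypothetical names; the only identifier shown is the OBI one given on the slide, and the NCI Thesaurus code is left as `None` because the slide does not state it.

```python
import re
from typing import Dict, NamedTuple, Optional


class OntologyTerm(NamedTuple):
    ontology: str
    label: str
    term_id: Optional[str]  # None where the slide gives no identifier


# Mappings taken from the slide; the structure itself is hypothetical.
TERM_MAP: Dict[str, OntologyTerm] = {
    "patient": OntologyTerm("OBI", "patient role", "OBI:0000093"),
    "physician": OntologyTerm("NCI Thesaurus", "Physician", None),
}


def map_terms(text: str) -> Dict[str, OntologyTerm]:
    """Return the ontology term for each known key word found in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return {word: TERM_MAP[word] for word in words if word in TERM_MAP}


# Step 1 of the patient scenario, as worded in Scenario 1.
step1 = "Patient visits clinician who enters symptoms into EHR."
mapped = map_terms(step1)
for word, term in mapped.items():
    print(word, "->", term.ontology, term.label, term.term_id)
```

Exact word matching is the simplest possible choice; words such as "clinician" stay unmapped until someone decides which ontology term they correspond to.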
14. Discovery
Questions and Answers
Q: What genes are associated with or implicated in AD?
A: Diseasome and PharmGKB indicate at least 97 genes have some association with AD.
Q: Which SNPs may be potential AD biomarkers?
A: PharmGKB reveals 63 SNPs.
Q: Which marketed drugs might potentially be repurposed for AD because they modulate AD implicated genes?
A: 57 compounds or classes of compounds are used to treat 45 diseases, including AD, diabetes, obesity, and hyper/hypotension.
15. Clinical Trials
Questions and Answers
Q: Since my patient is suffering from drug-induced side effects for AD treatment, can an AD clinical trial with a different mechanism of action be identified?
A: Of the 438 drugs linked to AD trials, only 58 are in active trials and only 2 (Doxorubicin and IL-2) have a documented mechanism of action. 78 AD-associated drugs have an established MOA.
Q: Can AD patients without the APOE4 allele be found, as these would be good candidates for the clinical trial involving Bapineuzumab?
A: Of the 4 patients with AD, only one does not carry the APOE4 allele, and may be a good candidate for the clinical trial.
Q: What active trials are ongoing that would be a good fit for Patient 2?
A: 58 Alzheimer trials, 2 mild cognitive impairment trials, 1 hypercholesterolaemia trial, 66 myocardial infarction trials, 46 anxiety trials, and 126 depression trials.
16. Physician
Questions and Answers
Q: What are the diagnostic criteria for AD?
A: There are 12 diagnostic inclusion criteria and 9 exclusion criteria.
Q: Does Medicare D cover Donepezil?
A: Medicare D covers two brand name formulations of Donepezil: Aricept and Aricept ODT.
Q: Have any AD patients been treated for other neurological conditions?
A: Patient 2 was found to suffer from AD and depression.
17. Summary
The data landscape for personalized medicine is
highly fragmented
Many domain-specific terminologies and ontologies
exist
Enabled connection of domain-specific ontologies
through a high-level, BFO-compliant ontology
A TMKB has been created that demonstrates the
proof of concept
Make your data as accessible as web pages.
See the CSHALS or Data.Gov
18. Thank you!
• Paper: TMO/TMKB (in press): http://bit.ly/fjPV5g
http://www.w3.org/wiki/HCLSIG/PharmaOntology/Publications
• Ontology: http://bit.ly/hJ7r4W
http://code.google.com/p/translationalmedicineontology/
• Use Cases: (Scenarios) http://bit.ly/evVtmt
http://www.w3.org/wiki/HCLSIG/PharmaOntology/UseCases
• Knowledge Base: http://bit.ly/ef2WLJ
http://www.w3.org/wiki/HCLSIG/PharmaOntology/TMKB
• Wiki: http://www.w3.org/wiki/HCLSIG/PharmaOntology
• Conference on Semantics in Health Care and Life Science
(CSHALS): http://www.iscb.org/cshals2011
• Semantic Health Care and Life Science Tutorial:
http://sparql.tw.rpi.edu/
20. Alzheimer’s Disease
Scenario 2
• Physician performs cognitive tests and
confirms AD diagnosis.
• Physician selects appropriate drug,
aided by the ontology.
• Physician prescribes a drug.
• Physician has follow-up visit.
21. Alzheimer’s Disease
Scenario 3
• Physician may investigate various
clinical trials for the patient.
• Physician may enroll patient in trial.
• Patient record updated.
22. TMO Query
How many patients experienced side effects while taking Donepezil?
[Figure: graphic representation of the question]
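The question above could be answered over a graph of triples roughly as follows. This is a hedged sketch in plain Python rather than SPARQL. The predicates `ex:takes` and `ex:experienced`, the class `ex:SideEffect`, and all patient facts are hypothetical stand-ins, not the TMO vocabulary or the TMKB data. It also simplifies by counting any side effect in a patient on the drug, without attributing the effect to the drug.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical records, invented for illustration only.
TRIPLES: List[Triple] = [
    ("ex:patient1", "ex:takes", "ex:Donepezil"),
    ("ex:patient1", "ex:experienced", "ex:nausea"),
    ("ex:nausea", "rdf:type", "ex:SideEffect"),
    ("ex:patient2", "ex:takes", "ex:Donepezil"),
    ("ex:patient3", "ex:takes", "ex:OtherDrug"),
    ("ex:patient3", "ex:experienced", "ex:nausea"),
]


def count_patients_with_side_effects(triples: List[Triple], drug: str) -> int:
    """Count distinct patients who take `drug` and experienced any side effect."""
    taking = {s for s, p, o in triples if p == "ex:takes" and o == drug}
    side_effects = {
        s for s, p, o in triples if p == "rdf:type" and o == "ex:SideEffect"
    }
    affected = {
        s
        for s, p, o in triples
        if p == "ex:experienced" and o in side_effects and s in taking
    }
    return len(affected)


count = count_patients_with_side_effects(TRIPLES, "ex:Donepezil")
print(count)
```

Collecting patients into a set before counting avoids double-counting a patient who reports more than one side effect.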
24. Data Sources
• Disparate data sources
– ClinicalTrials.gov, DailyMed, Diseasome,
DrugBank, LinkedCT, Medicare, SIDER
• Constructed AD diagnostic criteria.
• Seven synthetic patient records.
– Demographic, contact, family, lifestyle,
allergies, etc.
– Typical of a patient record
25. Future Directions
• Expand patient record representation
• Develop the representation of genetic
variation and pharmacogenetics
• Investigate animal models for disease
and capture treatment outcomes
• Explore integration with i2b2/
tranSMART
26. Data Challenges
• Patient data split across EHRs, clinical trial systems,
genetic testing vendors, and longitudinal studies
• Drug information split across systems such as the
Orange Book, DrugBank, ClinicalTrials.gov,
DailyMed, SIDER, PharmGKB, formulary lists
• Disease information split across OMIM, GEO,
commercial databases
• Different data representation approaches used by
different communities
• No unifying schema to pull data together