The Translational Medicine Ontology and Knowledge Base: Using Semantic Web Technology in Personalized Medicine for Data Integration
Joanne S. Luciano
Research Associate Professor
Tetherless World Constellation
Rensselaer Polytechnic Institute, Troy, NY USA
Translational Medicine
“...the process by which the results of research done in the laboratory are directly used to develop new ways to treat patients and research done outside the laboratory is used to inform laboratory research”
World Wide Web
• World Wide Web (WWW) - a system of extensively interlinked hypertext documents.
• Hypertext Markup Language (HTML) - the standard markup language for formatting and displaying documents on the World Wide Web.
• Hypertext Transfer Protocol (HTTP) - a protocol to transfer hypertext requests and information between servers and browsers.
Source: http://dictionary.reference.com/
Semantic Web
• Giant Global Graph (GGG): a transition from content plus pointers to content plus pointers plus relationships plus descriptions.
• Starting with RDF, OWL and SPARQL.
• Not magic bullets, but the tools which allow us to break free of the document layer.
Sources: http://dictionary.reference.com/browse/www and http://dig.csail.mit.edu/breadcrumbs/node/215
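To make "content plus pointers plus relationships plus descriptions" concrete, here is a minimal sketch in plain Python (not real RDF tooling) that stores facts as subject-predicate-object triples and matches them against a pattern with wildcards, roughly what a single SPARQL triple pattern does. Every `ex:` identifier is invented for illustration and does not come from the talk.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# All identifiers below (ex:...) are hypothetical, for illustration only.
TRIPLES = {
    ("ex:page1", "ex:linksTo", "ex:page2"),          # WWW-style untyped pointer
    ("ex:donepezil", "ex:type", "ex:Drug"),          # description of a resource
    ("ex:donepezil", "ex:treats", "ex:alzheimers"),  # typed relationship
    ("ex:alzheimers", "ex:type", "ex:Disease"),
}

def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Which resources are linked by the "treats" relationship?
treats = match(TRIPLES, p="ex:treats")
```

The difference from a plain hyperlink is that the middle element names the relationship, so a query can ask for "everything that treats something" rather than "everything that links to something".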
Personalized Medicine
• We are moving towards a system of personalized medicine.
• This requires more data availability and smarter systems on all sides.
• NOW IMAGINE: if data were as easily accessible as web pages
Assembly of Knowledge
• Identify it
• Access it
• Get it
• Assemble it
• Make sense out of it
[Figure: workflow connecting the AD patient scenario (timeline) and medical experts (top down) with existing ontologies (OBO, RO, BFO), the TMO ontology and its extensions, query sketches, executable queries, and existing and additional data needed for the use case (bottom up).]
Personalized Medicine Questions are Complex
• Understand disease heterogeneity
• Comprehend disease progression
• Determine genetic and environmental contributors
• Create treatments against relevant targets
 – Drugs against relevant targets (molecular structures)
 – Yoga against stress
 – Exercise against obesity
 – Elimination diet against food intolerance or allergy
• Develop markers to predict response
• Identify concrete endpoints to measure response
Personalized Medicine Data are Complex
Need an integrative data environment to answer scientific questions
 – Patient data
  • Genetics, epigenetics, expression, environment, phenotype, demographic
 – Treatment data
  • Existing drugs, mechanisms of action
 – Disease data
  • Human and animal models
 – Standard of care
  • Diagnostic guidelines
Data split up in many different places (bug or feature?)
W3C’s HCLS Interest Group
Mission: …use of Semantic Web technologies for health care and life science, with focus on …translational medicine. These domains stand to gain tremendous benefit… as they depend on the interoperability of information from many domains and processes for efficient decision support.
Participants
Bosse Andersson, AstraZeneca
Colin Batchelor, RSC
Olivier Bodenreider, NIH
Tim Clark, HMS
Christi Denney, Eli Lilly
Christopher Domarew, Albany Medical Center
Michel Dumontier, Carleton University
Thomas Gambet, W3C
Lee Harland, Pfizer
Anja Jentzsch, Free University Berlin
Vipul Kashyap, Cigna
Peter Kos, HMS
Julia Kozlovsky, AstraZeneca
Timothy Lebo, RPI
Joanne Luciano, RPI
Scott Marshall, Leiden University Medical Center
Jim McCusker, RPI
Deborah McGuiness, RPI
Jim McGurk, Daiichi Sankyo
Chimezie Ogbuji, Cleveland Clinic
Elgar Pichler, AstraZeneca
Bob Powers, Predictive Medicine
Eric Prudhommeaux, W3C
Matthias Samwald, DERI
Lynn Schriml, University of Maryland
Susie Stephens, Johnson & Johnson Pharmaceutical R&D
Peter Tonellato, HMS
Trish Whetzel, Stanford
Jun Zhao, Oxford University
Alzheimer’s Disease Scenario 1
• Patient visits clinician who enters symptoms into EHR.
• Physician does a differential diagnosis with working diagnosis of AD.
• Physician arranges for a battery of tests, all entered into EHR.
Mapping Terms to Existing Ontologies
Identify key terms and look for a standard ontology that contains each term.
• In Patient Scenario Step 1, map the word “patient” to the “patient role” in the Ontology for Biomedical Investigations (OBI) [OBI:0000093].
• Map “Physician” to the NCI Thesaurus term “Physician”.
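The mapping step can be sketched as a simple lookup table, shown here in Python as an illustration rather than as the project's actual tooling. Only the OBI identifier appears on the slide; no NCI Thesaurus code is given for "Physician", so that field is left empty rather than guessed. The names `OntologyTerm`, `TERM_MAP`, and `map_term` are hypothetical.

```python
from typing import NamedTuple, Optional

class OntologyTerm(NamedTuple):
    ontology: str            # name of the target ontology or terminology
    label: str               # preferred label of the target term
    term_id: Optional[str]   # None when the source gives no identifier

# Hypothetical lookup table built from the two mappings named on the slide.
TERM_MAP = {
    "patient": OntologyTerm("OBI", "patient role", "OBI:0000093"),
    # The slide names the NCI Thesaurus term but not its code, so none is filled in.
    "physician": OntologyTerm("NCI Thesaurus", "Physician", None),
}

def map_term(word: str) -> Optional[OntologyTerm]:
    """Look up a scenario word, ignoring case and surrounding whitespace."""
    return TERM_MAP.get(word.strip().lower())
```

A word with no entry returns `None`, which is the cue that a new term or an ontology extension may be needed.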
Translational Medicine Knowledge Base
• Terms (ontologies)
• Translation (linking)
• Data
Discovery Questions and Answers
Q: What genes are associated with or implicated in AD?
A: Diseasome and PharmGKB indicate at least 97 genes have some association with AD.
Q: Which SNPs may be potential AD biomarkers?
A: PharmGKB reveals 63 SNPs.
Q: Which market drugs might potentially be repurposed for AD because they modulate AD implicated genes?
A: 57 compounds or classes of compounds are used to treat 45 diseases, including AD, diabetes, obesity, and hyper/hypotension.
Clinical Trials Questions and Answers
Q: Since my patient is suffering from drug-induced side effects for AD treatment, can an AD clinical trial with a different mechanism of action be identified?
A: Of the 438 drugs linked to AD trials, only 58 are in active trials and only 2 (Doxorubicin and IL-2) have a documented mechanism of action. 78 AD-associated drugs have an established MOA.
Q: Find AD patients without the APOE4 allele as these would be good candidates for the clinical trial involving Bapineuzumab?
A: Of the 4 patients with AD, only one does not carry the APOE4 allele, and may be a good candidate for the clinical trial.
Q: What active trials are ongoing that would be a good fit for Patient 2?
A: 58 Alzheimer trials, 2 mild cognitive impairment trials, 1 hypercholesterolaemia trial, 66 myocardial infarction trials, 46 anxiety trials, and 126 depression trials.
Physician Questions and Answers
Q: What are the diagnostic criteria for AD?
A: There are 12 diagnostic inclusion criteria and 9 exclusion criteria.
Q: Does Medicare D cover Donepezil?
A: Medicare D covers two brand name formulations of Donepezil: Aricept and Aricept ODT.
Q: Have any AD patients been treated for other neurological conditions?
A: Patient 2 was found to suffer from AD and depression.
Summary
• The data landscape for personalized medicine is highly fragmented
• Many domain specific terminologies and ontologies exist
• Enabled connection of domain specific ontologies through a high level BFO compliant ontology
• A TMKB has been created that demonstrates the proof of concept
• Make your data as accessible as web pages. See the CSHALS or Data.Gov
Thank you!
• Paper: TMO/TMKB (in press): http://bit.ly/fjPV5g
 – http://www.w3.org/wiki/HCLSIG/PharmaOntology/Publications
• Ontology: http://bit.ly/hJ7r4W
 – http://code.google.com/p/translationalmedicineontology/
• Use Cases (Scenarios): http://bit.ly/evVtmt
 – http://www.w3.org/wiki/HCLSIG/PharmaOntology/UseCases
• Knowledge Base: http://bit.ly/ef2WLJ
 – http://www.w3.org/wiki/HCLSIG/PharmaOntology/TMKB
• Wiki: http://www.w3.org/wiki/HCLSIG/PharmaOntology
• Conference on Semantics in Health Care and Life Science (CSHALS): http://www.iscb.org/cshals2011
• Semantic Health Care and Life Science Tutorial: http://sparql.tw.rpi.edu/
Alzheimer’s Disease Scenario 2
• Physician performs cognitive tests and confirms AD diagnosis.
• Physician selects appropriate drug, aided by the ontology.
• Physician prescribes a drug.
• Physician has follow-up visit.
Alzheimer’s Disease Scenario 3
• Physician may investigate various clinical trials for the patient.
• Physician may enroll patient in trial.
• Patient record updated.
TMO Query
How many patients experienced side effects while taking Donepezil?
[Figure: graphic representation of the question]
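As a rough illustration of how such a question becomes a query over linked data, the sketch below answers it against a handful of made-up triples in plain Python. The predicate names (`takes`, `experienced`, `is_a`), the patient identifiers, and the data are all invented here and are not taken from the TMO or the TMKB. The sketch also simplifies the question: it does not model time, so it cannot tell whether a side effect occurred while the drug was being taken.

```python
# Hypothetical toy data: predicates and patient IDs are invented for this
# sketch and do not reflect the actual TMO vocabulary or TMKB content.
TRIPLES = [
    ("patient1", "takes", "Donepezil"),
    ("patient1", "experienced", "nausea"),
    ("patient2", "takes", "Donepezil"),
    ("patient3", "takes", "Memantine"),
    ("patient3", "experienced", "dizziness"),
    ("nausea", "is_a", "SideEffect"),
    ("dizziness", "is_a", "SideEffect"),
]

def patients_with_side_effects(triples, drug):
    """Count distinct patients who take `drug` and experienced any side effect."""
    side_effects = {s for s, p, o in triples if p == "is_a" and o == "SideEffect"}
    on_drug = {s for s, p, o in triples if p == "takes" and o == drug}
    affected = {s for s, p, o in triples if p == "experienced" and o in side_effects}
    # A patient counts only if both patterns match the same subject (a join).
    return len(on_drug & affected)

count = patients_with_side_effects(TRIPLES, "Donepezil")
```

The set intersection plays the role of a shared variable in a graph query: both patterns must bind to the same patient.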
Data Sources
• Disparate data sources
 – ClinicalTrials.gov, DailyMed, Diseasome, DrugBank, LinkedCT, Medicare, SIDER
• Constructed AD diagnostic criteria.
• Seven synthetic patient records.
 – Demographic, contact, family, lifestyle, allergies, etc.
 – Typical of a patient record
Future Directions
• Expand patient record representation
• Develop the representation of genetic variation and pharmacogenetics
• Investigate animal models for disease and capture treatment outcomes
• Explore integration with i2b2/tranSMART
Data Challenges
• Patient data split across EHRs, clinical trial systems, genetic testing vendors, and longitudinal studies
• Drug information split across systems such as the Orange Book, DrugBank, ClinicalTrials.gov, DailyMed, SIDER, PharmGKB, and formulary lists
• Disease information split across OMIM, GEO, and commercial databases
• Different data representation approaches used by different communities
• No unifying schema to pull data together