Connecting the dots: drug information and Linked Data



Presented as part of the AMIA2014 Knowledge Representation + Semantics and
Clinical Information Systems Working Groups Pre-Symposium "Drug
Terminology Standards: Meaningful Use and Better Knowledge"

November 16, 2014
Washington, DC

Published in: Health & Medicine
  1. 1. Connecting the dots: drug information and Linked Data Tomasz Adamusiak MD PhD 7omasz
  2. 2. Conflict of interest disclosure • Tomasz Adamusiak is a Senior Data Scientist at Thomson Reuters, provider of intelligent information for pharma and research institutions
  3. 3. Tomasz Adamusiak MD PhD • Former NLM Fellow and bioinformatician at EBI
  4. 4. Learning Objectives • Describe Linked Data and semantic content integration technologies • Recognize the value of integrating drug information with public resources
  6. 6. 2.5 exabytes ≈ 7 000 Libraries of Congress By Carol M. Highsmith (Own work) [CC-BY-SA-3.0]
  7. 7. 2.5 exabytes ≈ 7 000 Libraries of Congress
  8. 8. Tim Berners-Lee: the next Web of open, linked data If you want to put something on the web there are three rules: 1. All kinds of conceptual things, they have names now that start with HTTP. 2. If I take one of these HTTP names and I look it up [...] I fetch the data using the HTTP protocol from the web, I will get back some data in a standard format 3. It's got relationships [..] the other thing that it's related to is given one of those names that starts HTTP. So, I can go ahead and look that thing up. Sir Tim Berners-Lee on the next Web (TED2009)
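The second rule (look up an HTTP name, get back data in a standard format) is commonly implemented with HTTP content negotiation. A minimal Python sketch, assuming a hypothetical URI (`http://example.org/drug/aspirin`) and a server that would honour a request for Turtle; the code only builds the request and does not contact any server:

```python
from urllib.request import Request

def build_rdf_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare an HTTP GET that asks for RDF rather than an HTML page."""
    # The Accept header is how a client names the standard format it wants.
    return Request(uri, headers={"Accept": media_type})

# Hypothetical URI, used for illustration only.
request = build_rdf_request("http://example.org/drug/aspirin")
print(request.get_method(), request.full_url)
print(request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual lookup; whether RDF comes back depends entirely on the server.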
  9. 9. The 5 stars of open linked data ★ Putting anything up there ★★ Machine readable format ★★★ Non-proprietary format ★★★★ Use URLs to identify things ★★★★★ Provide context by linking to others Gov 2.0 Expo 2010: Tim Berners-Lee, "Open, Linked Data for a Global Community" https://
  10. 10. The RDF triple is the core concept underpinning the semantic web: subject, predicate, object. Example: <http://> <http://> "John Smith" (subject: example:index.html, predicate: dc:creator, object: John Smith)
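The subject/predicate/object structure can be sketched in a few lines of Python. The slide shows its URIs only in truncated form, so the full URIs below (`http://www.example.org/index.html` and the Dublin Core `creator` property) are assumptions chosen for illustration, and the serializer handles only plain string literals:

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # URI of the thing being described
    predicate: str  # URI of the property
    obj: str        # plain literal value (a URI object would need <...>)

def to_ntriples(t: Triple) -> str:
    """Serialize a triple with a plain-literal object as one N-Triples line."""
    # Escape backslashes first, then double quotes, inside the literal.
    escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{t.subject}> <{t.predicate}> "{escaped}" .'

# Assumed full URIs; the slide gives them only in truncated form.
triple = Triple(
    subject="http://www.example.org/index.html",
    predicate="http://purl.org/dc/elements/1.1/creator",
    obj="John Smith",
)
print(to_ntriples(triple))
```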
  11. 11. Several data sources available
  12. 12. Caveat 1: missing central URI reconciliation • Responsibility for URIs: http:// http:// http:// http:// • Versioning: http:// (FMA 3.1) http:// (FMA 3.0) http:// (Foundry-compliant URI) • Requires institutional support • RxNorm in RDF?
  13. 13. Caveat 2: data locality http://-storage-vs-bandwidth-debate/
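Without a central authority, a consumer typically has to maintain its own equivalence table between URIs minted by different publishers or for different versions. A rough Python sketch of that bookkeeping, assuming entirely hypothetical URIs (the slide's own URIs are truncated and are not reproduced here):

```python
def canonicalize(uri: str, same_as: dict[str, str]) -> str:
    """Follow equivalence links until a URI with no further mapping is reached."""
    seen = set()
    while uri in same_as and uri not in seen:
        seen.add(uri)  # guard against accidental cycles in the table
        uri = same_as[uri]
    return uri

# Hypothetical URIs for one anatomy concept, minted by different publishers
# and for different ontology versions.
same_as = {
    "http://example.org/fma/3.0/heart": "http://example.org/fma/3.1/heart",
    "http://example.org/fma/3.1/heart": "http://example.org/obo/FMA_heart",
    "http://mirror.example.net/fma#heart": "http://example.org/obo/FMA_heart",
}

for uri in same_as:
    print(uri, "->", canonicalize(uri, same_as))
```

Someone still has to curate the table, which is where the institutional support mentioned on the slide comes in.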
  14. 14. CONNECTING THE DOTS Given a therapeutic action (PPAR gamma partial/ agonist), what were the related compounds studied, the indications for treatment, the technologies of drug delivery, and the related genes and affected pathways?
  15. 15. EBI RDF Platform • All model elements with annotations to acetylcholine-gated channel complex (GO:0005892) • Samples treated with alcohol • Find drug-like (but currently not approved) molecules which bind 7TM1 GPCRs with high affinity • Under what experimental conditions is Ensembl gene ENSG00000129991 (TNNI3) expressed? • Pathways that reference Insulin (P01308) • What are the preferred gene name and disease annotations of all human UniProt entries that are known to be involved in a disease? ★★★★★
  16. 16. ★★★★★ Open PHACTS Discovery Platform Freely available, pharmacological data from a variety of resources + tools and services to support pharmacological research
  17. 17. ★★★★★ Bio2RDF: Linked Data for the Life Sciences • ~11 billion triples across 35 datasets • Datasets include:, dbSNP, GenAge, GenDR, LSR, OrphaNet, PubMed, SIDER, WormBase • Locally hosted endpoints: chembl, linkedSPL, pathwaycommons, reactome, wikipathways
  18. 18. NCBO BioPortal RDF • Provide RDF for each class in BioPortal so that we can have a URL to a concept that resolves to a set of RDF triples that provide essential information about the term • Provide an RDF dump of each ontology in BioPortal to put them in a triplestore to enable SPARQL access to the ontologies ★★★★★
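What loading triples into a triplestore and querying them with SPARQL buys you is, at its simplest, pattern matching over triples. A toy Python sketch of that idea, assuming hypothetical data with abbreviated prefixes; it is not BioPortal's implementation and covers only a single triple pattern:

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]

def match(store: list[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching a pattern; None is a wildcard, like a ?variable."""
    for triple in store:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple

# Hypothetical data; prefixes are abbreviated for readability.
store = [
    ("ex:Aspirin", "rdfs:label", "Aspirin"),
    ("ex:Aspirin", "rdfs:subClassOf", "ex:NSAID"),
    ("ex:Ibuprofen", "rdfs:subClassOf", "ex:NSAID"),
]

# Roughly: SELECT ?s WHERE { ?s rdfs:subClassOf ex:NSAID }
for subject, _, _ in match(store, p="rdfs:subClassOf", o="ex:NSAID"):
    print(subject)
```

A real triplestore adds indexing, joins across several patterns, and the full SPARQL language on top of this basic operation.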
  19. 19. ★★★★★ Linked Structured Product Labels http:// • LinkedSPLs publishes all sections of FDA-approved prescription and over-the-counter drug package inserts from DailyMed for use by NLP and Semantic Web researchers • All active moieties and product labels are mapped to RxNorm PURLs provided by the NCBO BioPortal SPARQL endpoint • LinkedSPLs is provided as a service as part of the Drug Interaction Knowledge Base (DIKB) project Boyce RD et al. Dynamic enhancement of drug product labels to support drug safety, efficacy, and effectiveness. J Biomed Semantics. 2013 Jan 26;4(1):5. PMID: 23351881.
  20. 20. Making public FDA datasets more accessible ★★★★ • Adverse events. FDA's publicly available drug adverse event and medication error reports, and medical device adverse event reports. • Recalls. Enforcement report data, containing information gathered from public notices about certain recalls of FDA-regulated products. • Labeling. Structured Product Labeling (SPL) data for FDA-regulated human prescription drug, OTC drug and biological product labeling.
  21. 21. RDF Representation of CDISC Foundational Standards • PhUSE and CDISC Draft RDF Representation • RDF could provide a foundation for interoperable end-to-end data standards in clinical research • http://-org/
  22. 22. Thank You