An introduction to the Linked Structured Product Labels (LinkedSPLs) resource for the W3C Health Care and Life Sciences Linking Open Drug Data task force. LinkedSPLs publishes selected sections of FDA-approved drug package inserts from DailyMed for use by NLP and Semantic Web researchers. Currently, only data from the product labels of prescription drugs is provided. The site's SPL data is updated weekly, and all SPLs retain DailyMed versioning data so that researchers can record the provenance of the text and sections they work with. LinkedSPLs is provided as a service of the Drug Interaction Knowledge Base (DIKB) project.

  1. 1. Linked Structured Product Labels (LinkedSPLs) - Towards a reference linked data source for drug product labeling. Richard Boyce, University of Pittsburgh, Department of Biomedical Informatics
  2. 2. Overview
     • About product labels
     • What are Structured Product Labels
     • Some use cases
     • Linked DailyMed was a great start
     • How LinkedSPLs is an improvement
     • Some best practices used and needed
     • Discussion
  3. 3. Why are product labels important?
     • Product labels are intended to be a reference for prescribing clinicians [1]
     1. Marroum PJ, Gobburu J. The product label: how pharmacokinetics and pharmacodynamics reach the prescriber. Clin Pharmacokinet. 2002;41(3):161-169.
     2. Ko Y, et al. Prescribers’ knowledge of and sources of information for potential drug-drug interactions: a postal survey of US prescribers. Drug Saf. 2008;31:525-536.
     3. Boyce RD, Collins C, Clayton M, Kloke J, Horn J. Inhibitory Metabolic Drug Interactions with Newer Psychotropic Drugs: Inclusion in Package Inserts and Influences of Concurrence in Drug Interaction Screening Software. Annals of Pharmacotherapy. (In Press).
  4. 4. Why is the product label apparently so influential?
     • Authoritative
     • Simple (for a drug expert) to follow
     • Often the only source of information besides FDA approval documentation for new drugs
     • No standard for searching the scientific literature
     • No standard for judging the quality of published studies
  5. 5. Structured Product Labels (SPL)
     • All package inserts for currently marketed drugs are available in this format [1-3]
     1. 2. 3.
  6. 6. Want to see one?
     • Perphenazine and Amitriptyline HCL:
       –
  7. 7. FDA dictates the kinds of claims that should be present in each section [1]
     1. FDA. Code of Federal Regulations 21 Part 201.56, chapter Requirements on content and format of labeling for human prescription drug and biological products. Washington, DC: US Government Printing Office, 2010.
  8. 8. DailyMed – the public source of SPLs
     • Provides an HTML view of SPLs using XSLT
     • Current statistics:
       – Human prescription products: ~17K
       – Human OTC: ~16K
       – Homeopathic: ~3K
       – Animal: 1K
       – Other: ~500
  9. 9. DailyMed uses Permanent URLs
     "SPL IDs are not static, so a label's URL may change if the label is updated. However, we provide permanent URLs to view or download the latest version of an SPL"
     • a permanent URL to view a label.
       – {setId}
     • a permanent URL to download a label as a ZIP file.
       – d={setid}
     • Try it for setid ‘ca5598e4-4226-45ab-abd1-e961707ae457’
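The permanent-URL idea on this slide amounts to substituting a set ID into a fixed URL pattern. The sketch below is illustrative only: the actual DailyMed URL patterns are not reproduced in this transcript, so the template is a hypothetical placeholder on example.org, and the helper names `normalize_setid` and `fill_template` are invented here. It also assumes set IDs are UUID-shaped, as the example set ID on the slide is.

```python
import uuid


def normalize_setid(setid: str) -> str:
    """Return the canonical lowercase, hyphenated form of a UUID-shaped set ID.

    Raises ValueError if the string is not a well-formed UUID.
    Assumption: set IDs are UUIDs, as the slide's example suggests.
    """
    return str(uuid.UUID(setid.strip()))


def fill_template(template: str, setid: str) -> str:
    """Substitute a validated set ID into a caller-supplied URL template."""
    return template.format(setid=normalize_setid(setid))


# Hypothetical template; substitute the real permanent-URL pattern.
VIEW_TEMPLATE = "http://example.org/view?setid={setid}"

url = fill_template(VIEW_TEMPLATE, "CA5598E4-4226-45AB-ABD1-E961707AE457")
print(url)
```

Validating the set ID before building the URL means a malformed identifier fails early rather than producing a link that silently resolves to nothing.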
  10. 10. SPLs on the Semantic Web 1/2
      • The DailyMed node:
        – 164,276 triples; 4,039 drugs
      • A great start but…
        – does not appear to handle drug products with more than one active ingredient correctly
        – clearly incomplete
        – incorrect encoding (Latin instead of Unicode)
        – how often is it updated and where is the versioning information?
        – where are the HTML tables and Image tags?
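The "Latin instead of Unicode" complaint usually shows up as mojibake: UTF-8 bytes that were decoded as Latin-1. The slides do not say exactly how the encoding went wrong, so the sketch below simply assumes that failure mode. The function names and the sample string are illustrative.

```python
def repair_mojibake(text: str) -> str:
    """Undo a UTF-8-read-as-Latin-1 mix-up; return text unchanged if it does not apply."""
    try:
        # Recover the original bytes, then decode them as the UTF-8 they really were.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside Latin-1, or the bytes are not
        # valid UTF-8: in both cases it was probably not garbled this way.
        return text


def looks_like_mojibake(text: str) -> bool:
    """Heuristic check: the repair succeeds and actually changes the text."""
    return repair_mojibake(text) != text


original = "5 µg of drug"
# Simulate the assumed bug: UTF-8 bytes interpreted as Latin-1.
garbled = original.encode("utf-8").decode("latin-1")

print(repr(garbled))
print(repr(repair_mojibake(garbled)))
```

This is only a heuristic. Text that is legitimately Latin-1 and happens to form valid UTF-8 byte sequences would be altered, so a check like this is better suited to flagging suspect labels than to blind bulk repair.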
  11. 11. SPLs on the Semantic Web 2/2
      • The LinkedSPL node:
        – 527K triples; 17K drugs
        – Active ingredients linked to ChEBI (via dc:subject)
          • Also, links to DrugBank (706) and bio2rdf (1412) via dailymed:subjectXref
      • Some improvements now implemented
        – correctly handles drug products with more than one active ingredient
        – complete for prescription drugs
        – correct encoding
        – updated weekly and versioning information present
        – HTML tables and Image tags retained in raw form
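Because active ingredients are linked to ChEBI via dc:subject, a client could look up labels by ingredient with a small SPARQL query. The sketch below only builds the query text and a GET request URL; it sends nothing. Several things in it are assumptions not taken from the slides: the endpoint address, the namespace IRI bound to the `dc:` prefix, the placeholder ChEBI IRI, and the exact shape of the triples in the LinkedSPL node.

```python
from urllib.parse import urlencode

# Assumption: "dc:" is bound to the Dublin Core elements namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"


def ingredient_query(chebi_iri: str, limit: int = 10) -> str:
    """Build a SPARQL query for labels whose dc:subject is the given ChEBI IRI."""
    return (
        f"PREFIX dc: <{DC_NS}>\n"
        "SELECT ?label WHERE {\n"
        f"  ?label dc:subject <{chebi_iri}> .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def request_url(endpoint: str, query: str) -> str:
    """Encode the query as a SPARQL-protocol GET request (the `query` parameter)."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical endpoint and placeholder ingredient IRI, for illustration only.
query = ingredient_query("http://example.org/chebi/PLACEHOLDER", limit=5)
print(query)
print(request_url("http://example.org/sparql", query))
```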
  12. 12. Goals for LinkedSPLs
      • I want this resource to:
        1. be a model of best practices for publishing linked open drug data in terms of provenance, data quality, and timeliness
        2. become the reference linked data source for drug product labels for the NLP and Semantic Web communities
  13. 13. Use Case 1
      1. An NLP researcher developing an algorithm for extracting entities or knowledge from the product label
        – May require only a particular section or set of sections
        – Would like to stratify sampling of the product labels by product and drug features
          • E.g., drug class, drug targets, pharmacokinetic properties, manufacturer, date of first release
        – The training data represents knowledge that is most useful if linked back to the SPL
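Stratified sampling by a drug feature, as in this use case, can be sketched in a few lines. The records, field names (`setid`, `drug_class`), and class values below are invented for illustration. In practice the features would come from queries against the linked data, and each sampled record would keep its set ID so results can be linked back to the SPL.

```python
import random
from collections import defaultdict


def stratified_sample(labels, key, per_stratum, seed=0):
    """Draw up to `per_stratum` labels from each group defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    strata = defaultdict(list)
    for label in labels:
        strata[label[key]].append(label)

    sample = []
    for name in sorted(strata):  # sorted for a stable iteration order
        members = strata[name]
        k = min(per_stratum, len(members))  # small strata contribute all they have
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical label records; real ones would carry DailyMed set IDs.
labels = [
    {"setid": "label-a", "drug_class": "class-1"},
    {"setid": "label-b", "drug_class": "class-1"},
    {"setid": "label-c", "drug_class": "class-1"},
    {"setid": "label-d", "drug_class": "class-2"},
]

for record in stratified_sample(labels, key="drug_class", per_stratum=2):
    print(record["drug_class"], record["setid"])
```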
  14. 14. Use Case 2
      1. A Semantic Web researcher wants to link to product labeling
        – May require only a particular section or set of sections
        – Wants to enable querying of the product labels by product and drug features
          • E.g., drug class, drug targets, pharmacokinetic properties, manufacturer, date of first release
        – Data quality and provenance important
  15. 15. Some best practices used and needed
      • PURLs not yet fully implemented
        – Makes for long URIs and server dependence
      • Some initial use of provenance metadata
        – See next slide
      • Revision history not yet implemented
        – Can be derived from DailyMed RSS feed
      • Not sure of all of the NLP and SW community requirements
      • Looking for collaborators
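Deriving a revision history from an update feed could look roughly like the sketch below. The layout of the real DailyMed RSS feed is not described in these slides, so the sample feed is hypothetical: it assumes standard RSS 2.0 items whose link carries a `setid` query parameter and whose `pubDate` uses the usual RFC 822 date format.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from email.utils import parsedate_to_datetime
from urllib.parse import parse_qs, urlsplit

# Hypothetical feed content, for illustration only.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>Label A (updated)</title>
<link>http://example.org/label?setid=aaa</link>
<pubDate>Tue, 17 Apr 2012 12:00:00 GMT</pubDate></item>
<item><title>Label B</title>
<link>http://example.org/label?setid=bbb</link>
<pubDate>Wed, 11 Apr 2012 09:30:00 GMT</pubDate></item>
<item><title>Label A</title>
<link>http://example.org/label?setid=aaa</link>
<pubDate>Tue, 10 Apr 2012 12:00:00 GMT</pubDate></item>
</channel></rss>"""


def revision_history(feed_xml: str) -> dict:
    """Map each set ID to the chronologically sorted dates it appeared in the feed."""
    history = defaultdict(list)
    for item in ET.fromstring(feed_xml).iter("item"):
        link = (item.findtext("link") or "").strip()
        published = (item.findtext("pubDate") or "").strip()
        setids = parse_qs(urlsplit(link).query).get("setid")
        if not setids or not published:
            continue  # skip items that lack the fields this sketch relies on
        history[setids[0]].append(parsedate_to_datetime(published))
    return {setid: sorted(dates) for setid, dates in history.items()}


for setid, dates in sorted(revision_history(SAMPLE_FEED).items()):
    print(setid, [d.isoformat() for d in dates])
```

A feed only covers a recent window, so a history built this way would have to be accumulated on each weekly update rather than reconstructed after the fact.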
  16. 16. My initial effort at provenance
      d2r:documentMetadata [
        foaf:primaryTopic <>;
        meta:d2rUser "Richard D. Boyce";
        meta:d2rUserHomepage <>;
        meta:d2rOperator "Richard D. Boyce";
        meta:license <>;
        meta:dataset <>;
        prv:performedAt "2012-04-10T12:00:00Z"^^xsd:dateTime;
        prv:performedBy <>;
        rdfs:comment "Please note that DailyMed updates are brought into the LinkedSPL resources once per week.";
  17. 17. Discussion/questions
  18. 18. Acknowledgements
      • The Drug Interaction Knowledge Base team
        – John Horn Pharm.D, Carol Collins MD, Greg Gardner, Rob Guzman
      • W3C LODD and Scientific Discourse Task Force