Which Drug Did You Mean?

BioIT workshop 2012


Transcript

  1. Which Drug Did You Mean? Resolving the linkage spaghetti between semantic names, structures, bioactivity and mixtures. Christopher Southan, ChrisDS Consulting, Göteborg, Sweden. Prepared for BioIT, Boston, April 2012, Track 14, Tuesday. See also http://cdsouthan.blogspot.se/2012/06/will-real-bosinhib-please-stand-up-take.html
  2. History of Drug Names: approximate timelines (timeline figure; bars listed in the order shown)
     • Compound registration system structure and ID
     • Patent IUPAC name or image
     • Internal code name(s), externally blinded
     • Code name(s) > structure declared externally
     • Journal papers
     • International Nonproprietary Name (INN)
     • INN indexed in MeSH
     • USAN, BAN, JAN
     • Brand name(s)
     • Combination brand
  3. History of Atorvastatin
     • 1985: IUPAC name (3R,5R)-7-[2-(4-fluorophenyl)-3-phenyl-4-(phenylcarbamoyl)-5-(propan-2-yl)-1H-pyrrol-1-yl]-3,5-dihydroxyheptanoic acid
     • ~1987: Parke-Davis internal code number CI-981
     • ~1995: Atorvastatin [INN:BAN]; Atorvastatin calcium [USAN]; Atorvastatin calcium trihydrate INN (error?); Atorvastatina (Spain)
     • 1997: Lipitor (brand name); Faboxim (Argentina); Zurinel (Chile); etc.
     • 2004: Caduet (brand name), combining Norvasc (amlodipine besylate) and Lipitor (atorvastatin calcium)
     • 2012: atorvastatin calcium, generic, Ranbaxy
     • 2012: amlodipine besylate and atorvastatin calcium, generic, Ranbaxy
  4. Causes of Drug Linkage Spaghetti (I)
     • Tautomer/stereo multiplexing and structure interconversion differences (e.g. complex antibiotics)
     • Popular structures > 100s of submitters > many vendors > more noise
     • Opaque ecosystem of primary submitters, secondary linkers, declared circularity, cryptic circularity, and submitters having independent portals with different rules
     • Older drugs accumulate 100s of synonyms and database x-refs, with errors
     • Accumulated wet assay results depend on how long the drug has been in which public screening collection
     • Deprecated structures are not always refreshed between databases globally
     • Pro-drugs, metabolites or tested combinations rarely have explicit x-refs
  5. Causes of Drug Linkage Spaghetti (II)
     • Literature extractions flowing into drug databases (including MeSH) can have
       – Author errors and paucity of standards in the primary report
       – No quality filtration at the result level
       – Curation errors and different annotation rules
       – No discrimination of independent de novo checking from annotation recycling
     • Large-scale patent extraction feeds into databases bring in
       – Forests of analogues with no data links
       – High redundancy for drugs and leads
       – Structural differences between pipeline outputs
       – Opportunistic permutations of salts and mixtures
       – Opportunistic virtual deuteration of all best-selling drugs
     • Drug discovery operations use many drugs as reference compounds in their internal screening collections. This means
       – Name > structure cross-mapping across internal, public and commercial sources
       – Integration of internal and external data across the same drugs
  6. Atorvastatin
     • The scale of links provides a good cross-section of problems
     • Relationship cross-mappings and the PubChem toolbox facilitate navigation through the links
     • External submissions each get a substance ID (SID); these are merged into compound records (CID) via chemistry rules (see PubChem documentation)
     • This drug has accumulated years of submissions from different sources, BioAssay entries and pharmacology literature links
     • The parent CID 60823 has
       – 99 synonyms
       – 6 stereo forms
       – 70 canonically related structures
       – 449 substance records
  7. What is Atorvastatin? For Patients
  8. Atorvastatin: for Informaticians
     • Identifiers: PubChem CID 60823; Wikipedia; ChemSpider 54810; DrugBank APRD00055; CHEMBL1487; CAS 134523-00-5
     • PubChem submissions include:
       – (3R,5R) CID 60823
       – (5R) CID 51052072
       – (3R) CID 21029434
       – (3S,5R) CID 6093359
       – (3S,5S) CID 62976
       – No stereo CID 2250
     • Query: Same, Isotopes for PubChem Compound (select 60823)
  9. Name Retrieval Specificity (I)
  10. Name Retrieval Specificity (II): "atorvastin" in DailyMed link, not synonyms
  11. Drug BioAssay Data: Splitting by Submitted Structure Differences
     • Mainly uHTS and counterscreens from Scripps & Burnham
     • AIDs 406848-53 in ChEMBL (antimalarial assay, specified salt)
     • ChEMBL antimalarial strain assays (also specified salt), in vivo, plus three target links
     • Mainly qHTS from NCGC, no hits
  12. Pharmacological Activity in vivo is ~70% Active Metabolites, i.e. not Atorvastatin
     • Hazardous Substances Data Bank x-ref in the CID, but no direct links to the metabolites (yet)
     • Only one in-vitro assay result for 9808225
     • Structures shown: CID 60823, CID 9851106, CID 9808225
  13. Salt Confusion (I)
     • CID 656846, Mw 1209, CAS 344423-98-9
     • CID 60822, Mw 1155, CAS 134523-03-8
     • CID 11227182, Mw 598
     • Atorvastatin Calcium on the FDA package insert label: hemicalcium trihydrate
     • INN = atorvastatin; USAN/BAN = atorvastatin calcium
  14. Salt Confusion (II): What gets to Patients
     • CID 656846, CID 53252956, CID 23665101
     • No INNs, USANs or clinical trials entries for these salts
  15. Mixtures: Problematic all Round
     • Atorvastatin parent (CID 60823) has 379 mixture SIDs and 147 mixture CIDs permutated from 122 component CIDs
     • Of the 122 components, 58 have a MeSH pharmacology tag, 92 have BioAssay results, 70 are in DrugBank, 101 are in ChEMBL, and 47 are below 200 Mw (and thus probably salts, not drugs)
     • Of the 147 mixture CIDs, only the 2 atorvastatin dimers have assay results or pharmacology, so none of the drug mixtures have direct data links
     • None are among the DrugBank CIDs and only atorvastatin calcium is in ChEMBL
     • 138 of the 147 have been extracted from patents by Derwent/Thomson and are unlikely to get data links
     • The small number of important drug combinations that do have data and/or trial results are difficult to identify
     • Tested drug mixtures rarely get public code names; some get trade names but never INNs
     • Chemistry rules may split mixtures and synonyms in databases
     • PubMed "Drug Combinations"[MeSH Term] = 54,186 but no SID or CID links
     • Mixture components can be designated with a space, "/", "+" or "co"
  16. The Famous Polypill: A Fuzzy Term
     • CID 44602839, Thomson Pharma
     • 18 clinicaltrials.gov entries, but only partial component links
     • aspirin 81 mg, enalapril 2.5 mg, atorvastatin 20 mg and hydrochlorothiazide 12.5 mg (polypill); PMID: 21647425; Australian New Zealand Clinical Trials Registry ACTRN12607000099426
     • DrugBank and TTD negative
  17. Caduet: an Approved Combination. Sources shown: DrugBank, Wikipedia, http://clinicaltrials.gov/ct2/show/NCT01107743
  18. Submitter Synonym Noise in PubChem
  19. A More Recent Combination. But QA149 is negative in PubChem, DrugBank and TTD
  20. Spaghetti is Resolvable but Errors are Tough: Will the Real LX4211 Please Stand Up?
     • http://cenblog.org/the-haystack/2012/03/liveblogging-first-time-disclosures-from-acssandiego/
     • See also: http://cdsouthan.blogspot.se/2012/03/live-chemical-structure-blogging-but.html
  21. Summary
     • You can navigate the linkage spaghetti in name, synonym, structure, bioactivity and mixture space, but this needs perspicacity and circumspection
     • The current drug information ecosystem, with multiple stakeholders, seems destined to remain "fuzzy"
     • Beyond the informatics challenges, the consequences, particularly from frank errors, could be more serious
     • WHO INNs and naming stems play a key positive role, but:
       – No open authoritative database, only 7000 PDF entries (!)
       – No transparent coordination between USAN, FDA, MeSH, national offices, or clinical trials registries
       – Susceptible to commercial flanking tactics
     • Drug combinations have a bright pharmacological future but a difficult informatics one
     • The fuzz includes scientific challenges (e.g. complex structures, dynamic tautomerism, active metabolites, formulation differences, paucity of standardised and comparable activity data)
     • Efforts are being made to improve the situation, including from the databases represented in this Workshop session
  22. Questions Welcome
     • ChrisDS Consulting: http://www.cdsouthan.info/Consult/CDS_cons.htm
     • Mobile: +46(0)702-530710, Skype: cdsouthan
     • Email: cdsouthan@hotmail.com
     • Twitter: http://twitter.com/#!/cdsouthan
     • Blog: http://cdsouthan.blogspot.com/
     • LinkedIn: http://www.linkedin.com/in/cdsouthan
     • Website: http://www.cdsouthan.info/CDS_prof.htm
     • Publications: http://www.citeulike.org/user/cdsouthan/publications/order/year
     • Citations: http://scholar.google.com/citations?user=y1DsHJ8AAAAJ&hl=en
     • Presentations: http://www.slideshare.net/cdsouthan
     • FYI: A short piece on identifying the names and molecular details of drugs in clinicaltrials.gov: http://www.samedanltd.com/magazine/13/issue/166/article/3152
