1.
Challenges of curating approved medicines:
Will the real drugs please stand up?
Chris Southan, representing the Database Team
NC-IUPHAR/BPS/GTPdb Biannual Meeting, Paris, October 2014
2.
What is the total for approved drug structures?
Take your pick ...
4.
Explanations
• Discordance: distinctly different drug molecular representations from
different sources that we would recognise canonically as the same
bioactive substance
• These end up as multiple CIDs per drug (i.e. “multiplexed”) under the
PubChem chemistry rules due to:
– Permutation of R/S stereo centers
– Salt forms
– Mixtures
– Unresolved E/Z bonds
– Tautomers
– Isotopic derivatives including deuteration
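As a rough illustration of how one of these causes (salt forms) can be collapsed, the sketch below keeps the largest dot-separated fragment of a SMILES string. It is a crude heuristic under stated assumptions: the function name strip_salt is hypothetical, string length stands in for fragment size, and charges are not neutralised. A cheminformatics toolkit would normally compare heavy-atom counts instead.

```python
def strip_salt(smiles):
    """Return the longest dot-separated fragment of a SMILES string.

    Assumption: the longest fragment (by character count) is the parent
    drug and the shorter ones are counter-ions or solvent. This is only
    a proxy for fragment size, not a real chemistry rule.
    """
    fragments = smiles.split(".")
    return max(fragments, key=len)


# Sodium acetate written as two fragments: the counter-ion is dropped.
parent = strip_salt("[Na+].CC(=O)[O-]")

# A single-component structure is returned unchanged.
unchanged = strip_salt("CCO")
```

This only addresses salts and simple mixtures; stereo, E/Z, tautomer and isotopic variants need a different kind of normalisation.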
5.
Causes of drug structure multiplexing
• Inherent challenges and complexities of chemical representation
• Utility of PubChem depends on advanced rules applied to a submission-based
system
• Drug companies never verify their own structures in public databases
• Legacy of structure image primacy in documents
• No clear accountability for correctness of public approved drug structures
(companies? FDA? WHO(INN)? AMA(USAN)? Wikipedia? CAS?)
• Structural variants enter databases from general source proliferation,
large-scale patent extractions, chemical vendor submissions and
repeated exemplifications in journals
• The net effect is an inexorable increase in multiplexing but not necessarily
erroneous structures per se
10.
Reading the links for alternative taxols:
different structures > 20 sets of assay results
11.
Virtual deuteration: compounding drug multiplexing
12.
Scale of the issue for approved drugs in PubChem:
multiplexing expansion from 2005 to 2014
13.
So how are we doing in our database?
• Sets were salt-stripped for this comparison
• GTPdb (Oct 2014) has 983 approved drug CIDs concordant with either
ChEMBL or DrugBank
• But only 723 are 4-way concordant
• We will inspect the 152, 192 and 180 sectors for consensus expansion
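The concordance counts above are set operations on CID lists. A minimal sketch follows, assuming each source has already been reduced to a set of salt-stripped CIDs. The CID values are toy numbers, not real data, and source_d is a placeholder for the fourth source, which the slide text does not name.

```python
def concordant_with_any(target, others):
    """CIDs in `target` that also appear in at least one other source."""
    return target & set().union(*others)


def concordant_with_all(sources):
    """CIDs present in every source (the n-way consensus)."""
    return set.intersection(*sources)


# Toy CID sets (hypothetical values for illustration only).
gtpdb = {1, 2, 3, 4, 5}
chembl = {1, 2, 3, 6}
drugbank = {1, 2, 4, 7}
source_d = {1, 2, 8}  # placeholder for the unnamed fourth source

either = concordant_with_any(gtpdb, [chembl, drugbank])
four_way = concordant_with_all([gtpdb, chembl, drugbank, source_d])
```

The sectors that fall outside the four-way consensus are the differences between such sets, which is where manual inspection would focus.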
14.
Consequences and possible solutions to the
drug multiplexing issue
• Our drugs annotation Committee cannot magic these issues away
but their support is crucial
• Our consensus approach is useful and statistically defensible
• In the GTPdb we add curator comments and cross-pointers for key
multiplexed examples
• Sources that make the effort to collate drug structure sets should
cross-corroborate more
• A canonical approach to merging drug structure-to-bioactivity
mappings could be considered
• The inner connectivity layer of the InChIKey goes some way towards
this
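One way to picture the InChIKey point is to group records on the first 14-character block of the key, which hashes the connectivity, so that stereo and isotopic variants of the same skeleton fall together. This is a sketch, not the GTPdb method: the function names are hypothetical, and the keys and CIDs are made-up placeholders rather than real identifiers.

```python
from collections import defaultdict


def connectivity_block(inchikey):
    """Return the first hyphen-separated block of an InChIKey."""
    return inchikey.split("-")[0]


def group_by_connectivity(records):
    """Map each connectivity block to the list of CIDs that share it."""
    groups = defaultdict(list)
    for cid, inchikey in records:
        groups[connectivity_block(inchikey)].append(cid)
    return dict(groups)


# Placeholder (CID, InChIKey) pairs: the first two differ only in the
# second block, as stereo or isotopic variants would.
records = [
    (101, "AAAAAAAAAAAAAA-BBBBBBBBSA-N"),
    (102, "AAAAAAAAAAAAAA-CCCCCCCCSA-N"),
    (103, "DDDDDDDDDDDDDD-UHFFFAOYSA-N"),
]
groups = group_by_connectivity(records)
```

Salt forms and mixtures change the first block as well, so they would still need salt stripping before this grouping helps.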