
Comparing ChEMBL, DrugBank, Human Metabolome db and Therapeutic Target db at the chemistry and protein levels

Talk given at the Paris NC-IUPHAR meeting, Paris, October 2013

ChEMBL, DrugBank, Human Metabolome Database and the Therapeutic Target Database are resources of curated chemistry-to-protein relationships widely used in the chemogenomic arena. In this work we have extended an earlier analysis (PMID 22821596) by comparing chemistry and protein target content between 2010 and 2013. For the former, details are presented for overlaps and differences, statistics of stereochemistry as well as stereo representation and MW profiles between the four databases. For 2013 our results indicate quality improvements, major expansion, increased achiral structures and changes in MW distributions. An orthogonal comparison of chemical content with different sources inside PubChem highlights further interpretable differences. Expansion of protein content by UniProt IDs is also recorded for 2013 and Gene Ontology comparisons for human-only sets indicate differences. These emphasise the expanding complementarity of chemistry-to-protein relationships between sources, although different criteria are used for their capture.
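The overlap and difference statistics mentioned above can be illustrated with plain set operations. The following is only a minimal sketch, not the method used in the paper: it assumes each database has already been reduced to a set of comparable structure identifiers (the choice of identifier is left open), and the function name `compare_sources` and the single-letter toy identifiers are invented for illustration.

```python
from itertools import combinations


def compare_sources(sources):
    """Return pairwise overlap counts and per-source unique counts.

    `sources` maps a database name to a set of structure identifiers.
    """
    # Size of the intersection for every pair of databases.
    pairwise = {
        (a, b): len(sources[a] & sources[b])
        for a, b in combinations(sorted(sources), 2)
    }
    # Identifiers found in one database and in none of the others.
    unique = {}
    for name, ids in sources.items():
        others = set().union(*(s for n, s in sources.items() if n != name))
        unique[name] = len(ids - others)
    return pairwise, unique


# Toy identifiers, invented for illustration only (not real database content).
sources = {
    "ChEMBL": {"A", "B", "C", "D"},
    "DrugBank": {"B", "C", "E"},
    "HMDB": {"C", "F"},
    "TTD": {"B", "G"},
}
pairwise, unique = compare_sources(sources)
print(pairwise)
print(unique)
```

In practice the result depends heavily on how structures are normalised before comparison (for example, whether stereochemistry is retained), which is one reason the talk treats stereo representation separately.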


  1. Comparing ChEMBL, DrugBank, Human Metabolome db and Therapeutic Target db at the chemistry and protein levels. The presentation is based on this paper: http://onlinelibrary.wiley.com/doi/10.1002/minf.201300103/abstract Chris Southan, Curator for IUPHAR-db/Guide to PHARMACOLOGY. Presented at the IUPHAR Meeting, Paris, October 2013.
  2. [no transcribed text]
  3. So why compare databases?
     • Determine what they actually contain
     • Determine extraction selectivity
     • Retro-divine the curatorial rules
     • Assess relationship mapping stringency and fidelity
     • Understand entity, attribute and relationship distributions
     • Become aware of declared or cryptic circularity
     • Detect global error propagation
     • Evaluate unique content
     • Judge utility, complementarity, consumability and integratability
     • See what to avoid in your own database
     • Know what to emulate
  4. Introducing the resources (I)
     • ~ 2/3 of ChEMBL is curated from medicinal chemistry papers, mainly as structure-activity relationship (SAR) results. The other ~ 1/3 is from confirmatory PubChem BioAssays. Release 15 (January 2013) contains 9,570 targets, 1,254,575 distinct compounds, 10,509,572 activities and 48,735 publications.
     • DrugBank collates target and mechanism-of-action information. Version 3.0 (January 2011) contains 6,715 drug entries including 1,452 FDA-approved small molecules, 131 biologicals, 86 nutraceuticals and 5,076 experimental compounds. These are mapped to 4,233 protein IDs.
     • TTD is conceptually similar to DrugBank but the compound-to-target mappings are focussed on primary targets. It includes a three-way split of targets and compounds into marketed, clinical trial and research phase. The latest version 4.3.02 (August 2011) includes 2,025 targets and 17,816 chemical structures, including 1,540 approved drugs.
  5. Introducing the resources (II)
     • HMDB collates detailed chemical, clinical and biochemical data on human metabolites. These are linked to other databases including enzymes involved in the transformations. Version 3.0 (September 2012) contains 40,437 chemical entries and 5,650 protein sequence identifiers.
     • IUPHARdb/GTP (which you are hearing about this weekend): 6,064 ligands, 559 approved drugs, 894 unique (UniProt) targets with direct activity mappings (all ligand types, all species, but excluding kinase screens) and 21,774 references.
  6. Chemistry comparisons
  7. Chemistry Mw vs over time
  8. Resolving content inside PubChem (slice ‘n dice). Labels on slide: ChEMBL (twice), HMDB, DrugBank, TTD
  9. Comparative protein attributes (e.g. GO). Labels on slide: HMDB, IUPHARdb
  10. Comparing UniProt cross-references
  11. Compare individual curatorial errors
  12. Protein ID comparisons. Consensi are corroborative (but beware of curatorial circularity). Human Swiss-Prot IDs
  13. Differences in curation rules (e.g. atorvastatin)
  14. Check for false-negatives
  15. Thanks, questions welcome. Our 2012 paper (but the data was 2010): Mapping between databases of compounds and protein targets. Muresan S, Sitzmann M, Southan C. Methods Mol Biol. 2012;910, PMID:22821596. Now on http://figshare.com/articles/Mapping_Between_Databases_of_Compounds_and_Protein_Targets/818979 If you enjoyed this presentation, you might also like PMID: 20298516.
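
The protein ID comparison on slide 12 rests on the idea that an ID listed by several sources is corroborated by consensus. A minimal sketch of that tally follows, assuming each source has been reduced to a set of protein accessions; the function names and the placeholder IDs (`P1` to `P5`) are hypothetical and do not come from the talk.

```python
from collections import Counter


def consensus_counts(sources):
    """Map each protein ID to the number of sources that list it."""
    counts = Counter()
    for ids in sources.values():
        # set() guards against an ID being repeated within one source.
        counts.update(set(ids))
    return counts


def by_support(counts):
    """Tally how many IDs are listed by exactly 1, 2, ... sources."""
    return dict(sorted(Counter(counts.values()).items()))


# Placeholder accessions, invented for illustration only.
protein_ids = {
    "ChEMBL": {"P1", "P2", "P3"},
    "DrugBank": {"P1", "P2", "P4"},
    "HMDB": {"P1", "P5"},
    "TTD": {"P1", "P2"},
}
counts = consensus_counts(protein_ids)
support = by_support(counts)
print(support)
```

A high count is only corroborative if the sources curated the relationship independently. Where one database imports mappings from another, the count overstates the evidence, which is the curatorial circularity the slide warns about.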
