ACS San Francisco 2010 CINF Talk

  1. NCI/CADD: Open-access chemical structure web platform. Markus Sitzmann [1], Wolf-Dietrich Ihlenfeldt [2], and Marc C. Nicklaus [1]. [1] Computer-Aided Drug Design Group, Chemical Biology Laboratory, NCI-Frederick, NIH, DHHS; [2] Xemistry GmbH, Auf den Stieden 8, D-35094 Lahntal, Germany
  2. NCI/CADD Public Web Services
     - Enhanced NCI Database Browser (http://cactus.nci.nih.gov/ncidb2): web service for NCI/DTP's Open NCI Database; first release 1998, updated 2001; ~250,000 structure records; ~60 million data points
     - Chemical Structure Lookup Service (http://cactus.nci.nih.gov/lookup): structure lookup in over 100 databases; first release 2006, updated 2008; ~74 million structure records (~46 million unique structures)
  3. NCI/CADD Public Web Services
     - OSRA (http://cactus.nci.nih.gov/osra/): converts graphical representations of chemical structures in journal articles, patent documents, textbooks, trade magazines, etc., into SMILES
     - Online SMILES Translator (http://cactus.nci.nih.gov/translate/)
     - GIF Creator for Chemical Structures (http://cactus.nci.nih.gov/gifcreator/)
     - PROSIT: Online Pseudorotation Tool, Version 2 (http://cactus.nci.nih.gov/prosit/)
  4. http://cactus.nci.nih.gov
  5. New Web Services
  6. Chemical Structure Representations. A chemical structure can be represented by: NCI/CADD Identifiers, InChI/InChIKey, ChemSpider ID, PubChem SID/CID, chemical names, CAS Registry Number, NSC number, FDA UNII, ChemNavigator SID, SMILES, SD File, Chemical Formula, ChEBI ID, PDB Ligand ID, MRV, CML, SYBYL Line Notation, GIF image
  7. Chemical Identifier Resolver (NCI/CADD Web Resources): http://cactus.nci.nih.gov/chemical/structure. Works as a resolver for different chemical structure identifiers. Allows one to convert a given structure identifier into another representation or structure identifier.
  8. Chemical Identifier Resolver: http://cactus.nci.nih.gov/chemical/structure
     - first beta release: July 2009
     - second beta release: Nov. 2009
     - third beta release: April/May 2010 (beta versions will continue through 2010)
     - 3.0 million requests since July 1, 2009 (~11,000/day)
  9. Chemical Identifier Resolver: URL API
     - usable through a simple URL API: http://cactus.nci.nih.gov/chemical/structure/"identifier"/"representation" (MIME type: text/plain)
     - example: http://cactus.nci.nih.gov/chemical/structure/Tamiflu/cas returns 204255-11-8
     - XML format: http://cactus.nci.nih.gov/chemical/structure/"identifier"/"representation"/xml
     - if a request is not resolvable: HTTP 404 status message
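The URL pattern on this slide can be exercised from a script. The following is a minimal sketch, assuming the service is still reachable at the address given in the talk; the function names `build_url` and `resolve` are hypothetical, and percent-encoding the identifier is an assumption rather than something the slide specifies.

```python
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen

# Base address as given on the slide (the live service may have moved since 2010).
BASE_URL = "http://cactus.nci.nih.gov/chemical/structure"


def build_url(identifier, representation, xml=False):
    """Build .../"identifier"/"representation", optionally with the /xml suffix."""
    # Percent-encoding the identifier is an assumption made for this sketch.
    url = f"{BASE_URL}/{quote(identifier, safe='')}/{representation}"
    return url + "/xml" if xml else url


def resolve(identifier, representation):
    """Return the text/plain body, or None when the request is not resolvable."""
    try:
        with urlopen(build_url(identifier, representation), timeout=30) as response:
            return response.read().decode("utf-8").strip()
    except HTTPError as err:
        if err.code == 404:  # the slide states that unresolvable requests give HTTP 404
            return None
        raise
```

Calling `resolve("Tamiflu", "cas")` would issue the example request from the slide; the call is left out of the block so that the sketch does not depend on network access.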
  10. Chemical Identifier Resolver: request processing
     - HTTP request with an "identifier" (e.g. CAS number, chemical name) and a "representation" (e.g. InChI, GIF image)
     - detection of the identifier type
     - identifier is a full structure representation (e.g. SMILES, InChI): calculation of the requested structure representation
     - identifier is a hashed structure representation (e.g. InChIKey), chemical name, etc.: database lookup
     - HTTP response (with MIME type)
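The identifier-type detection step can be illustrated with a few pattern checks. This is only a sketch under stated assumptions: the regular expressions and the `classify` function are hypothetical and are not the resolver's actual detection logic. Strings that match none of the patterns (for example "CCO", which is both a SMILES string and a name abbreviation) are deliberately left undecided.

```python
import re

# (type name, pattern, processing route); all patterns are illustrative assumptions.
PATTERNS = [
    ("InChI", re.compile(r"^InChI=1S?/"), "calculate"),
    ("InChIKey", re.compile(r"^[A-Z]{14}-[A-Z]{10}-[A-Z]$"), "lookup"),
    ("CAS number", re.compile(r"^\d{2,7}-\d{2}-\d$"), "lookup"),
    ("NCI/CADD identifier", re.compile(r"^[0-9A-F]{16}-(FICTS|FICuS|uuuuu)$"), "lookup"),
]


def classify(identifier):
    """Return (identifier type, route) for the first matching pattern."""
    for type_name, pattern, route in PATTERNS:
        if pattern.match(identifier):
            return type_name, route
    # SMILES strings and chemical names cannot be told apart by syntax alone.
    return "unknown", "try SMILES and name resolvers"
```

Full structure representations take the "calculate" route, while hashed identifiers and registry numbers take the "lookup" route, mirroring the two branches on the slide.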
  11. "Chemical Structure Web Engine". [Diagram: the Chemical Structure Web Engine combines CACTVS and the NCI/CADD Chemical Structure Database (CSDB); it serves the NCI/CADD web services, including the Chemical Identifier Resolver, and connects via http to external web services and other software packages]
  12. Chemical Structure Database (NCI/CADD Web Resources)
     - number of structure records: 103.9 million
     - number of unique structures: Std. InChIKey: ~73.0 million; FICuS: ~70.6 million; uuuuu: ~65.3 million; union set of unique structures: ~83.6 million
     - from the set of ~83.6 million unique structures we have derived about 10 million additional scaffold-type structures (for future structure searches); thus:
     - available for lookup "identifier → structure": ~92.9 million Standard InChIKeys; ~93.3 million NCI/CADD Identifiers; ~70 million chemical names linked to ~16 million structures
  13. Chemical Structure Database (NCI/CADD Web Resources)
     - ChemNavigator iResearch Library: compilation of commercially available screening compounds from ~300 international chemistry suppliers
     - PubChem database, including Open NCI database, EPA DSSTox databases, NIAID HIV databases, NIST Webbook, NLM ChemIDplus, ChemSpider, …
     - Commercial sources / others: Asinex, Comgenex, …
     - as of March 2010: 140 chemical structure databases; 103.9 million structure records; ~70.6 million unique structures by FICuS
     - share by source: ChemNavigator iResearch Library ~56%; PubChem ~38%; others ~6%
  14. NCI/CADD Structure Identifiers: Unique Representation of Chemical Structures
     - based on hashcodes calculated by the chemoinformatics toolkit CACTVS
     - CACTVS hashcodes: represent a chemical structure uniquely as a 16-digit hexadecimal number (64-bit unsigned); have a high sensitivity to structural features of a compound; change if connectivity changes
     - example: 9850FD9F9E2B4E25 [structure drawing]
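The "16-digit hexadecimal number (64-bit unsigned)" format can be made concrete with a short sketch. It only shows how such a value is parsed and printed; it does not compute a CACTVS hashcode, and the helper names are hypothetical.

```python
import re

HASHCODE_RE = re.compile(r"^[0-9A-F]{16}$")


def parse_hashcode(text):
    """Convert a 16-digit uppercase hexadecimal string to an unsigned 64-bit integer."""
    if not HASHCODE_RE.match(text):
        raise ValueError(f"not a 16-digit hexadecimal hashcode: {text!r}")
    return int(text, 16)


def format_hashcode(value):
    """Format an unsigned 64-bit integer as a zero-padded 16-digit hexadecimal string."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit into 64 bits")
    return f"{value:016X}"
```

Sixteen hexadecimal digits carry exactly 64 bits, which is why the two descriptions on the slide are equivalent.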
  15. [Figure: related forms of one compound (charged form, tautomers, isotope-labeled form, stereoisomers, salt, drawing "errors"), each with its own CACTVS hashcode. Hashcodes shown: A3DAE0788050DDE4, 3ECEF579D7DF025A, E92E4BA2869F3611, 8A7AD1EB498CC76A, 6C16DE2351F9FF50, 9850FD9F9E2B4E25, 8F7A1DE5A733F0E0, 60525E1AF41497B6, B2FDA68AEDA06DB9]
  16. NCI/CADD Structure Identifiers: Unique Representation of Chemical Structures. [Diagram: input structure (MDL Molfile, MDL SDF, SMILES, ChemDraw cdx, PDB) → structure normalization → parent structure (MDL SDF, SMILES, database) → hashcode calculation (E_HASHISY) → NCI/CADD Identifier]
  17. NCI/CADD Structure Identifiers: Structure Normalization. Adjustable levels of sensitivity; each feature is either "sensitive" or "un-sensitive", and an un-sensitive feature is normalized as follows:
     - Fragments: keep only largest organic fragment
     - Isotopes: ignore isotope labels
     - Charges: uncharge
     - Tautomers: find canonical tautomer
     - Stereochemistry: discard stereo information
     [Figure: example structures]
  18. NCI/CADD Structure Identifiers: Structure Normalization. [Diagram repeated: Fragments, Isotopes, Charges, Tautomers, and Stereochemistry, each switchable between sensitive and un-sensitive; example structures]
  19. NCI/CADD Structure Identifiers: FICTS identifier: representation of the exact drawing. Sensitive to Fragments (F), Isotopes (I), Charges (C), Tautomers (T), and Stereochemistry (S). [Figure: pairwise = / ≠ comparison of the example structures]
  20. NCI/CADD Structure Identifiers: FICuS identifier: comes closest to how a chemist perceives a compound. Sensitive to Fragments (F), Isotopes (I), Charges (C), and Stereochemistry (S); un-sensitive to Tautomers (u). [Figure: pairwise = / ≠ comparison of the example structures]
  21. NCI/CADD Structure Identifier: uuuuu identifier: closely related forms of the same compound. Un-sensitive (u) to Fragments, Isotopes, Charges, Tautomers, and Stereochemistry. [Figure: all example structures compare as equal (=)]
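The relation between an identifier suffix and the normalization steps can be sketched as a small lookup. The step descriptions come from the sensitivity slide; the function `normalization_steps` and the table layout are hypothetical and do not perform any chemistry.

```python
# One entry per suffix position: (sensitive letter, feature, step applied when un-sensitive).
FEATURES = [
    ("F", "fragments", "keep only largest organic fragment"),
    ("I", "isotopes", "ignore isotope labels"),
    ("C", "charges", "uncharge"),
    ("T", "tautomers", "find canonical tautomer"),
    ("S", "stereochemistry", "discard stereo information"),
]


def normalization_steps(suffix):
    """List the normalization steps implied by a suffix such as FICTS, FICuS or uuuuu."""
    if len(suffix) != len(FEATURES):
        raise ValueError("suffix must have exactly five characters")
    steps = []
    for flag, (letter, feature, step) in zip(suffix, FEATURES):
        if flag == "u":  # un-sensitive: this feature is normalized away
            steps.append(step)
        elif flag != letter:  # anything else must be the sensitive letter
            raise ValueError(f"unexpected flag {flag!r} for {feature}")
    return steps
```

Under this reading, FICTS applies no normalization, FICuS applies only the tautomer step, and uuuuu applies all five.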
  22. FICTS identifiers of the related forms (charged form, tautomers, isotope, salt, stereoisomers, "errors"): A3DAE0788050DDE4-FICTS, E5F83F10C5DB080A-FICTS, B2FDA68AEDA06DB9-FICTS, 9850FD9F9E2B4E25-FICTS, E5F83F10C5DB080A-FICTS, E92E4BA2869F3611-FICTS, 8A7AD1EB498CC76A-FICTS, 6C16DE2351F9FF50-FICTS, 9850FD9F9E2B4E25-FICTS. [Figure: structure drawings]
  23. FICuS identifiers of the related forms (charged form, tautomers, isotope, salt, stereoisomers, "errors"): A3DAE0788050DDE4-FICuS, E5F83F10C5DB080A-FICuS, B2FDA68AEDA06DB9-FICuS, 9850FD9F9E2B4E25-FICuS, E5F83F10C5DB080A-FICuS, E92E4BA2869F3611-FICuS, 8A7AD1EB498CC76A-FICuS, 9850FD9F9E2B4E25-FICuS, 9850FD9F9E2B4E25-FICuS. [Figure: structure drawings]
  24. uuuuu identifiers of the related forms (charged form, tautomers, isotope, stereoisomers, salt, "errors"): 9850FD9F9E2B4E25-uuuuu, 9850FD9F9E2B4E25-uuuuu, 9850FD9F9E2B4E25-uuuuu, 9850FD9F9E2B4E25-FICuS, 9850FD9F9E2B4E25-uuuuu, 9850FD9F9E2B4E25-uuuuu, 9850FD9F9E2B4E25-uuuuu, 9850FD9F9E2B4E25-uuuuu, 9850FD9F9E2B4E25-uuuuu. [Figure: structure drawings]
  25. NCI/CADD Chemical Structure Database
     - NCI/CADD:RID (structure records): 103.5 million
     - NCI/CADD:CID (compounds; structures unique by CACTVS HASHISY): 83.6 million
     - FICTS associations: ~72.0 million; FICuS associations: ~70.6 million; uuuuu associations: ~65.3 million
     - ~130 million linkouts to original database records
     - linked to: StdInChI[Key], chemical names, chemical formula, properties, etc.
  26. Chemical Identifier Resolver (NCI/CADD Public Web Resources): http://cactus.nci.nih.gov/chemical/structure
     - "identifier": chemical names, CAS numbers, SMILES strings, IUPAC InChI/InChIKeys, NCI/CADD Identifiers, CACTVS HASHISY, NSC number, PubChem SID/CID, FDA UNII, ChemSpider ID, ChemNavigator SID, Chemical Formula
     - "representation": /smiles; /names, /iupac_name; /cas; /inchi, /stdinchi; /inchikey, /stdinchikey; /ficts, /ficus, /uuuuu; /image; /file, /sdf; /mw, /monoisotopic_mass; /formula; /twirl, /3d; /urls; /unii; /chemspider_id; /pubchem_sid; /chemnavigator_sid
  27. Chemical Identifier Resolver: Standard InChIKey
     - can resolve ~93.0 million Standard InChIKeys into a full structure representation
     - http://cactus.nci.nih.gov/chemical/structure/LFQSCWFLJHTTHZ-UHFFFAOYSA-N/smiles returns: CCO
     - http://cactus.nci.nih.gov/chemical/structure/LFQSCWFLJHTTHZ-UHFFFAOYSA/smiles returns: CCO, CC[OH2+]
     - http://cactus.nci.nih.gov/chemical/structure/LFQSCWFLJHTTHZ/smiles returns: C(C(O)([2H])[2H])[2H], CC(O)([2H])[2H], C(CO)([2H])([2H])[2H], CC[17OH], C(CO)[2H], [14CH3]CO, CCO
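The three requests on this slide differ only in how many hyphen-separated blocks of the Standard InChIKey are sent. A minimal sketch of deriving the shortened query strings follows; the function name `inchikey_queries` is hypothetical.

```python
def inchikey_queries(inchikey):
    """Return the full key, the key without its last block, and the first block only."""
    blocks = inchikey.split("-")
    if len(blocks) != 3:
        raise ValueError("expected a Standard InChIKey with three hyphen-separated blocks")
    # Fewer blocks means a less specific query and, on the slide, more matching structures.
    return ["-".join(blocks[:count]) for count in (3, 2, 1)]
```

For the ethanol key this yields the three identifiers used in the slide's example URLs, from the most specific (one structure) to the least specific (ethanol together with its charged and isotope-labeled forms).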
  28. Chemical Identifier Resolver: File Representation. Example: http://cactus.nci.nih.gov/chemical/structure/LFQSCWFLJHTTHZ-UHFFFAOYSA-N/file?format=sdf. Available formats:
     - alc: Alchemy format
     - cdxml: CambridgeSoft ChemDraw XML format
     - cerius: MSI Cerius II format
     - charmm: Chemistry at HARvard Macromolecular Mechanics file format
     - cif: Crystallographic Information File
     - cml: Chemical Markup Language
     - ctx: Gasteiger Clear Text format
     - gjf: Gaussian input data file
     - gromacs: GROMACS file format
     - hyperchem: HyperChem file format
     - jme: Java Molecule Editor format
     - maestro: Schroedinger MacroModel structure file format
     - mol: Symyx molecule file
     - sybyl2/mol2: Tripos Sybyl MOL2 format
     - mrv: ChemAxon MRV format
     - pdb: Protein Data Bank
     - sdf: Symyx Structure Data Format
     - sdf3000: Symyx Structure Data Format 3000
     - sln: SYBYL Line Notation
     - smiles: SMILES
     - xyz: xyz file format
  29. Chemical Identifier Resolver: Structure Image Generation
     - http://cactus.nci.nih.gov/chemical/structure/buckyball/image?height=300&width=300&bgcolor=black&bondcolor=white
     - http://cactus.nci.nih.gov/chemical/structure/aspirin/image?height=200&width=200&symbolfontsize=7&footer="Aspirin"
     [Figure: generated images of buckyball and aspirin]
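Both the file and the image representations take their options as URL query parameters. The sketch below assembles such URLs with the standard library; the parameter names are the ones shown on the slides, while the function name `representation_url` and the percent-encoding of the identifier are assumptions.

```python
from urllib.parse import quote, urlencode

# Base address as given in the talk.
BASE_URL = "http://cactus.nci.nih.gov/chemical/structure"


def representation_url(identifier, representation, **params):
    """Build a resolver URL and append optional query parameters."""
    url = f"{BASE_URL}/{quote(identifier, safe='')}/{representation}"
    if params:
        url += "?" + urlencode(params)
    return url


# The two example requests from the slides, rebuilt from their parts.
image_url = representation_url(
    "buckyball", "image", height=300, width=300, bgcolor="black", bondcolor="white"
)
file_url = representation_url("LFQSCWFLJHTTHZ-UHFFFAOYSA-N", "file", format="sdf")
```

Keeping the parameters as keyword arguments avoids hand-assembling the `?key=value&key=value` string and takes care of escaping values that contain special characters.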
  30. Chemical Identifier Resolver: TwirlyMol, implemented by Noel O'Boyle (University College Cork, Ireland). Simple JavaScript that allows you to render a rotatable/zoomable 3D representation of a molecule in your web browser. No plugin is needed, only a modern browser. [Browser compatibility chart: Chrome, Safari, FF3.5/3.6, FF3.0, FF2.0, IE8, IE7, IE6]
  31. Chemical Identifier Resolver: TwirlyMol
     - simple viewer: http://cactus.nci.nih.gov/chemical/structure/restasis/twirl
     - embed into a web page: <div id="canvas" height="400" width="400"></div> <script src="http://cactus.nci.nih.gov/chemical/structure/restasis/twirl_cached/canvas"></script>
  32. restasis
  33. Chemical Identifier Resolver: TwirlyMol. http://www.coronene.com/blog/, http://chemical-quantum-images.blogspot.com, http://baoilleach.blogspot.com/
  34. Chemical Identifier Resolver: Ambiguous Identifiers
     - e.g. the string "CCO" can be resolved as the SMILES string of "ethanol" or as an abbreviation for "Carboxymethylthio-3-(3-Chlorphenyl)-1,2,4-Oxadiazol)"
     - name a specific resolver module:
     - http://cactus.nci.nih.gov/chemical/structure/CCO/iupac_name?resolver=smiles returns: ethanol
     - http://cactus.nci.nih.gov/chemical/structure/CCO/iupac_name?resolver=name returns: 2-[[3-(3-chlorophenyl)-1,2,4-oxadiazol-5-yl]sulfanyl]acetic acid
  35. Chemical Identifier Resolver: Ambiguous Identifiers. XML format for the ambiguous string "CCO": http://cactus.nci.nih.gov/chemical/structure/CCO/iupac_name/xml
     <?xml version="1.0" encoding="UTF-8"?>
     <request string="CCO" representation="iupac_name">
       <data id="1" resolver="smiles" string_class="SMILES String">
         <item id="1">ethanol</item>
       </data>
       <data id="2" resolver="name" string_class="Chemical Name">
         <item id="1">2-[[3-(3-chlorophenyl)-1,2,4-oxadiazol-5-yl]sulfanyl]acetic acid</item>
       </data>
     </request>
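A client can separate the competing interpretations by reading the `resolver` attribute of each `data` element. The sketch below parses the response shown on the slide (embedded as a string, so no request is made); the function name `results_by_resolver` is hypothetical.

```python
import xml.etree.ElementTree as ET

# XML response for "CCO" as shown on the slide.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<request string="CCO" representation="iupac_name">
  <data id="1" resolver="smiles" string_class="SMILES String">
    <item id="1">ethanol</item>
  </data>
  <data id="2" resolver="name" string_class="Chemical Name">
    <item id="1">2-[[3-(3-chlorophenyl)-1,2,4-oxadiazol-5-yl]sulfanyl]acetic acid</item>
  </data>
</request>"""


def results_by_resolver(xml_text):
    """Map each resolver module to the list of results it produced."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {
        data.get("resolver"): [(item.text or "").strip() for item in data.findall("item")]
        for data in root.findall("data")
    }
```

With the sample above, the "smiles" entry holds the ethanol result and the "name" entry holds the oxadiazole result, so the caller can decide which interpretation was intended.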
  36. Chemical Identifier Resolver: Database URL Lookup. Get the URLs of the original structure records: http://cactus.nci.nih.gov/chemical/structure/restasis/urls/xml
     <?xml version="1.0" encoding="UTF-8"?>
     <request string="restasis" representation="urls">
       <data id="1" resolver="name" string_class="Chemical Name">
         <item id="1" classification="exact" database="ChemSpider" publisher="ChemSpider">http://chemspider.com/structure.4939506</item>
         <item id="2" classification="exact" database="ChemSpider" publisher="PubChem">http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?sid=43028058</item>
         <item id="3" classification="exact" database="NLM ChemIDplus" publisher="NLM">http://chem.sis.nlm.nih.gov/chemidplus/direct.jsp?result=advanced&regno=059865133
         […]
       </data>
     </request>
  37. Chemical Identifier Resolver: Name Lookup. Get available names: http://cactus.nci.nih.gov/chemical/structure/CC(=O)Oc1ccccc1C(O)=O/names/xml
     <?xml version="1.0" encoding="UTF-8"?>
     <request string="CC(=O)Oc1ccccc1C(O)=O" representation="names">
       <data id="1" resolver="smiles" string_class="SMILES String" description="CC(=O)Oc1ccccc1C(O)=O">
         <item id="1" classification="PUBCHEM_IUPAC_NAME">2-acetyloxybenzoic acid</item>
         <item id="2" classification="PUBCHEM_IUPAC_OPENEYE_NAME">2-Acetoxybenzoic acid</item>
         <item id="3" classification="PUBCHEM_GENERIC_REGISTRY_NAME">50-78-2</item>
         <item id="4" classification="PUBCHEM_GENERIC_REGISTRY_NAME">11126-35-5</item>
         <item id="5" classification="PUBCHEM_GENERIC_REGISTRY_NAME">11126-37-7</item>
         <item id="6" classification="PUBCHEM_GENERIC_REGISTRY_NAME">2349-94-2</item>
         <item id="7" classification="PUBCHEM_GENERIC_REGISTRY_NAME">26914-13-6</item>
         <item id="8" classification="PUBCHEM_SUBSTANCE_SYNONYM">NCGC00090977-04</item>
         <item id="9" classification="PUBCHEM_SUBSTANCE_SYNONYM">KBioSS_002272</item>
         <item id="10" classification="PUBCHEM_SUBSTANCE_SYNONYM">SBB015069</item>
         <item id="11" classification="PUBCHEM_SUBSTANCE_SYNONYM">Aspirin</item>
         <item id="12" classification="PUBCHEM_SUBSTANCE_SYNONYM">D00109</item>
         […]
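Because every name carries a `classification` attribute, a client can pick out, for example, only the registry numbers or only the synonyms. The following sketch groups names by that attribute; the sample is an abbreviated excerpt of the slide's response with the closing tags added so that it parses, and the function name `names_by_classification` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Abbreviated excerpt of the response on the slide; closing tags added for this sketch.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<request string="CC(=O)Oc1ccccc1C(O)=O" representation="names">
  <data id="1" resolver="smiles" string_class="SMILES String">
    <item id="1" classification="PUBCHEM_IUPAC_NAME">2-acetyloxybenzoic acid</item>
    <item id="3" classification="PUBCHEM_GENERIC_REGISTRY_NAME">50-78-2</item>
    <item id="4" classification="PUBCHEM_GENERIC_REGISTRY_NAME">11126-35-5</item>
    <item id="11" classification="PUBCHEM_SUBSTANCE_SYNONYM">Aspirin</item>
  </data>
</request>"""


def names_by_classification(xml_text):
    """Group the returned names by their classification attribute."""
    grouped = defaultdict(list)
    root = ET.fromstring(xml_text.encode("utf-8"))
    for item in root.iter("item"):
        grouped[item.get("classification")].append((item.text or "").strip())
    return dict(grouped)
```

Grouping this way keeps the document order within each classification, which matters if the first entry of a group is treated as the preferred name.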
  38. http://cactus.nci.nih.gov/blog /chemical/structure Blog
  39. In Development: http://cactus.nci.nih.gov/TEST_chemical/structure
  40. Chemical Identifier Resolver: "Chemical Operators"
     - an operator manipulates the structure created from the identifier
     - the new representation is calculated after structure manipulation
     - URL scheme: http://cactus.nci.nih.gov/chemical/structure/operator:identifier/representation
     - operators: tautomers, canonical_tautomer, addh, removeh, nostereo, rings, …
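The operator syntax only changes the identifier segment of the URL, which becomes `operator:identifier`. A minimal sketch of assembling such a request follows; the function name `operator_url` is hypothetical, and since the operators were still in development at the time of the talk, the base address is kept as a parameter.

```python
from urllib.parse import quote

# Production address from the talk; the operators were shown on a TEST_ deployment.
DEFAULT_BASE = "http://cactus.nci.nih.gov/chemical/structure"


def operator_url(operator, identifier, representation, base=DEFAULT_BASE):
    """Build .../operator:identifier/representation for a chemical operator request."""
    # The colon separating operator and identifier is left unescaped on purpose.
    return f"{base}/{operator}:{quote(identifier, safe='')}/{representation}"
```

The operator name is not validated here because the list on the slide is open-ended.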
  41. "Chemical Operator": Tautomers. http://cactus.nci.nih.gov/chemical/structure/tautomers:guanine/"representation" [Figure: tautomers of guanine]
  42. IUPAC InChI/InChIKey Resolver
     - (hopefully) there will be many resolvers from different providers with different backgrounds: publishers; commercial databases; free sources and databases (ChemSpider, PubChem, ChEBI, …)
     - Std. InChI[Key] is the perfect tool to interlink the resolvers
     - ChemSpider and NCI/CADD are working on a test protocol for a federated InChI/InChIKey resolver
  43. IUPAC InChI/InChIKey Resolver. [Diagram: Clients → IUPAC Root Resolver → Resolver 1, Resolver 2, Resolver 3; Resolver 3 → Resolver 3.1, Resolver 3.2, Resolver 3.3; the Chemical Identifier Resolver is one of the resolvers]
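The hierarchy in this diagram can be pictured as a tree in which a resolver answers from its own records and forwards the query to its child resolvers. The sketch below is purely illustrative: the talk only says that a test protocol is being worked on, so the `Resolver` class, its delegation strategy, and the placement of the ethanol record are all assumptions.

```python
class Resolver:
    """Hypothetical node in a federated InChIKey resolver tree."""

    def __init__(self, name, records=None, children=()):
        self.name = name
        self.records = dict(records or {})  # InChIKey -> structure (e.g. SMILES)
        self.children = list(children)

    def resolve(self, inchikey):
        """Return (resolver name, structure) pairs found here or in any child."""
        hits = []
        if inchikey in self.records:
            hits.append((self.name, self.records[inchikey]))
        for child in self.children:
            hits.extend(child.resolve(inchikey))
        return hits


# Tree shaped like the diagram; the location of the record is made up for the example.
root = Resolver(
    "IUPAC Root Resolver",
    children=[
        Resolver("Resolver 1"),
        Resolver("Resolver 2"),
        Resolver(
            "Resolver 3",
            children=[
                Resolver("Resolver 3.1"),
                Resolver("Resolver 3.2", {"LFQSCWFLJHTTHZ-UHFFFAOYSA-N": "CCO"}),
                Resolver("Resolver 3.3"),
            ],
        ),
    ],
)
```

A client asks only the root; which provider actually holds the record stays hidden behind the tree, which is the point of interlinking resolvers through the Standard InChIKey.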
  44. Chemical Identifier Resolver (NCI/CADD Web Resources): http://cactus.nci.nih.gov/chemical/structure; blog: http://cactus.nci.nih.gov/blog
  45. Acknowledgments. ChemNavigator: Scott Hutton, Tad Hurst; CADD Group, CBL, NCI: Igor Filippov; Noel O'Boyle; Hans-Juergen Himmler (Akos). Thanks to all database providers! Our web site: http://cactus.nci.nih.gov
  46. Users
     - webel.py - A Cinfony module (http://baoilleach.blogspot.com/2009/11/introducing-webel-cheminformatics.html)
     - IUPHAR DATABASE (http://www.iuphar-db.org)
     - http://www.akosgmbh.eu/globalsearch/index.htm
     - avogadro.openmolecules.net/
     - CACTVS (http://www.xemistry.com)
     - in silico toxicology (http://www.in-silico.ch/)
     - Symyx Draw Resolver (http://www.symyx.com/)