“Online chemical database with modeling environment”
a summer school course


Sergii Novotarskyi
Iurii Sushko

AACIMP 2009 Summer School lecture by Yuriy Sushko and Sergii Novotarskyi. "Environmental Chemoinformatics" course.

  1. “Online chemical database with modeling environment”, a summer school course. Sergii Novotarskyi, Iurii Sushko
  2. Chemoinformatics – overview of online resources. Chemical databases:
      1. PubChem: a database that provides information on the biological activities of small molecules
      2. ChemSpider: a free-access service providing a structure-centric community for chemists
      3. ChemIDplus: a tool that provides chemical structure, property, and toxicity searching
      4. ChemBank: a database of chemical structures and assays
      5. ChemDB: a set of chemoinformatics tools
  3. Literature databases:
      6. PubMed: a service that includes over 19 million citations from MEDLINE and other life science journals for biomedical articles back to 1948
      7. Toxicology Literature Online (TOXLINE): references from toxicology literature
      8. ScienceDirect: a full-text scientific database offering articles and chapters from more than 2,500 peer-reviewed journals and more than 10,000 books
      9. ACS Publications: a worldwide scientific community with a collection of the most cited peer-reviewed journals in the chemical and related sciences
  4. PubChem – start page. URL: http://pubchem.ncbi.nlm.nih.gov/ or search for «PubChem»
  5. PubChem – search results
  6. PubChem – compound details
  7. PubChem – bioassay search results
  8. ChemSpider – start page. URL: http://www.chemspider.com/ or search for «ChemSpider»
  9. ChemSpider – search results
  10. ChemIdPlus – main page. URL: http://chem.sis.nlm.nih.gov/chemidplus/ or search for «ChemIdPlus»
  11. ChemIdPlus – search results
  12. ChemBank – main page. URL: http://chembank.broadinstitute.org/ or search for «ChemBank»
  13. ChemBank – search results
  14. ChemDB – main page. URL: http://cdb.ics.uci.edu/ or search for «ChemDB»
  15. ChemDB – search results
  16. PubMed – main page. URL: http://www.ncbi.nlm.nih.gov/pubmed/ or search for «PubMed»
  17. Online chemical database with modeling environment. The subject of development: the web-based service
      • The database of physical, chemical and biological properties
          • Accumulating experimentally verified data
          • Providing user-friendly web-based access to this data
      • The QSPR modeling environment
          • Providing web-based tools for QSPR modeling
          • Storing and “publishing” created models
  18. Motivation. Our motivation:
      • The importance of QSPR modeling
      • The importance of web-based tools for QSPR modeling
      • The importance of building one more service in this field
  19. Motivation – QSPR
      • Structure-property relationship hypothesis: “Similar structures, similar properties”
      • QSPR modeling: predicting properties based on available data for structurally similar molecules. Structures are represented by a set of descriptors (atom count, molecular weight).
      • [Figure: four molecule structures; three are labeled with log(IC50) = 1.87, 1.87 and 0.64 log(µM), the fourth with log(IC50) = ?]
  20. QSPR – Similarity in descriptor space. [Figure; axis label: number of specific fragments in a molecule]
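The “similar structures, similar properties” idea from slides 19 and 20 can be sketched as a nearest-neighbour prediction in descriptor space. This is only a minimal illustration, not the modeling method used by the service: the descriptor vectors below are hypothetical fragment counts invented for the example, the function name `predict_knn` is an assumption, and only the three log(IC50) values are taken from the slide.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def predict_knn(query, training, k=2):
    """Average the property values of the k training molecules nearest to `query`."""
    ranked = sorted(training, key=lambda item: dist(query, item[0]))
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Each entry: (descriptor vector, log(IC50) in log(µM)).
# The descriptor vectors are HYPOTHETICAL fragment counts, made up for this sketch.
training = [
    ((3, 1), 1.87),
    ((3, 2), 1.87),
    ((0, 5), 0.64),
]

# A molecule with an unknown property, described by the same two fragment counts.
query = (4, 1)
prediction = predict_knn(query, training, k=2)
print(f"predicted log(IC50): {prediction:.2f} log(µM)")
```

The query lies close to the first two training molecules in descriptor space, so the prediction follows their values rather than that of the distant third molecule.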
  21. Motivation – web-based tools for modeling. Main benefits of web-based tools:
      • Availability and accessibility
          • only a computer with Internet access and a modern web browser is required to start working
          • work materials can be shared among several locations
          • works on any platform (Linux, Windows, Mac)
      • Communication and collaboration
          • possibility to work on common topics, publish your own results and use new results from other people
  22. Motivation – one more web-based tool. Reasons to build one more service:
      • Different approach to data modification: a completely open database in which any user can add, delete and edit data (constrained only by a set of simple rules)
      • Different approach to data organization: data in the database is organized in a way suitable for QSPR modeling
      • Integration of a database with modeling tools: data from the database can be used for model creation and property prediction
  23. Distinctive features. The features that make our service different:
      • “Wiki” approach to data handling: users can add, modify and delete data
      • Mandatory reference to an article: every record in the database should contain a reference to the article where the data was published
      • Storing additional information: we store measurement conditions to increase data quality
      • Several tools to support decision making: integration with other web services (validation of molecule names against the PubChem database, automatic fetching of article information from PubMed), duplicate records management
      • Aimed at model building: it is convenient to build training sets from data by filtering by property or article, then either exporting the data to the internal modeling tools or downloading it as an Excel file
  24. Data structure
  25. Simplified data structure. Entities: Records, Properties, Conditions, Molecules, Users, Units, Articles, Journals
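One way to picture the entities named on slide 25 is as a set of linked record types. The sketch below is an assumption throughout: the slide lists only the entity names, so every field name and every relation between the classes is hypothetical and is not the actual database schema. The sample values come from row 3 of the data collection table on slide 34.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# All field names and relations below are ASSUMPTIONS made for illustration.
@dataclass
class Journal:
    title: str


@dataclass
class Article:
    title: str
    pubmed_id: Optional[int] = None
    journal: Optional[Journal] = None


@dataclass
class Molecule:
    name: str
    smiles: Optional[str] = None


@dataclass
class Unit:
    name: str


@dataclass
class Property:
    name: str
    unit: Optional[Unit] = None


@dataclass
class Condition:
    name: str
    value: str


@dataclass
class User:
    login: str


@dataclass
class Record:
    """One measured value: a molecule, a property, its source article and conditions."""
    molecule: Molecule
    prop: Property
    value: str
    article: Article          # mandatory reference to the publication
    introduced_by: User
    conditions: List[Condition] = field(default_factory=list)


record = Record(
    molecule=Molecule(name="Mexiletine"),
    prop=Property(name="CYP450 Modulation"),
    value="Inhibitor",
    article=Article(title="Involvement of CYP1A2 in mexiletine metabolism",
                    pubmed_id=9690950),
    introduced_by=User(login="student"),  # hypothetical user name
    conditions=[Condition(name="CYP450 Type", value="CYP1A2")],
)
print(record.molecule.name, record.value, record.conditions[0].value)
```

Making the article a required field of `Record` mirrors the “mandatory reference to an article” rule from slide 23.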
  26. User interface agreements. Browser-based interface
  27. User interface agreements. Browser-based interface
  28. User interface agreements. Icons (images not reproduced) and their meanings:
      • Edit current record (item, article, unit, etc.)
      • Delete current record
      • In most places, open a record-specific submenu; sometimes, view profile
      • Open a wiki page with additional explanations
      • Send a message to the user
      • Download data in XLS format
      • Select item
  29. Summary. The database currently contains:
      • More than 50,000 records
      • Around 285 properties
      • More than 2,700 articles
  30. Thank you
  31. Practical course – outline
      • Collection of data from original literature
      • Use of publicly available tools for literature and chemical structure lookup
      • Introduction of data to OCHEM: single record
      • Collection of data from benchmark literature
      • Introduction of data to OCHEM: batch upload
  32. Practical course – collection of data – before we start. Empty table to fill in, rows 1 to 5:
      No. | Article name | PubMedID | Compound name | Value
  33. Practical course – collection of data
      • The goal: collect data on CYP450 1A2 inhibitors and noninhibitors
      • Cytochrome P450 (abbreviated CYP, P450, CYP450) is a very large and diverse superfamily of hemoproteins found in all domains of life. (Wikipedia)
      • PubMed search terms: CYP1A2 inhibition
  34. Practical course – data collection
      No. | Article name | PubMedID | Compound name | CYP Modulation
      1 | Chemical genomics of cancer chemopreventive dithiolethiones | 19126641 | 3H-1,2-dithiole-3-thione | Inhibitor
      1 | (same article) | 19126641 | 4-methyl-5-pyrazinyl-3H-1,2-dithiole-3-thione | Noninhibitor
      1 | (same article) | 19126641 | 5-tert-butyl-3H-1,2-dithiole-3-thione | Noninhibitor
      2 | Comprehensive in vitro analysis of voriconazole inhibition of eight cytochrome P450 (CYP) enzymes: major effect on CYPs 2B6, 2C9, 2C19, and 3A | 19029318 | Voriconazole | Noninhibitor
      3 | Involvement of CYP1A2 in mexiletine metabolism | 9690950 | Mexiletine | Inhibitor
      4 | Differential inhibition of cytochrome P450 isoforms by the protease inhibitors, ritonavir, saquinavir and indinavir | 9278209 | Indinavir | Noninhibitor
      5 | An evaluation of potential mechanism-based inactivation of human drug metabolizing cytochromes P450 by monoamine oxidase inhibitors, including isoniazid | 16669850 | Clorgyline | Inhibitor
  35. Practical course – data introduction – cheat sheet
      • Good chemistry lookup engine: PubChem (find the URL via Google)
      • We search by name and want to get the structure
      • Convenient structure representation: SMILES
      • Property: CYP450 Modulation
      • Condition: CYP450 Type = CYP1A2
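The name-to-SMILES lookup on this cheat sheet is done by hand in the PubChem web interface during the course. As a possible alternative, the sketch below builds a request against PubChem's PUG REST service. This is a swapped-in technique, not something the slides describe; the helper names `smiles_url` and `fetch_smiles` are hypothetical, and the property name `CanonicalSMILES` is an assumption that should be checked against the current PUG REST documentation, since PubChem has revised its SMILES property names over time.

```python
from urllib.parse import quote
from urllib.request import urlopen

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def smiles_url(name, prop="CanonicalSMILES"):
    """Build a PUG REST URL asking for one property of a compound found by name.

    The property name is an assumption; verify it in the PUG REST documentation.
    """
    return f"{PUG_REST}/compound/name/{quote(name, safe='')}/property/{prop}/TXT"


def fetch_smiles(name):
    """Query PubChem over the network; returns one string per matching compound."""
    with urlopen(smiles_url(name), timeout=30) as response:
        return response.read().decode("utf-8").split()


if __name__ == "__main__":
    # Only prints the URL; call fetch_smiles("Voriconazole") to hit the network.
    print(smiles_url("Voriconazole"))
```

A name can match several compounds, so `fetch_smiles` returns a list and the result should still be checked by eye before it is entered into the database.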
  36. Practical course – batch data introduction – template. Columns:
      • CASRN: CAS Registry Number
      • SMILES: SMILES string
      • NAME: molecule name
      • ARTICLEID: article identifier (PubMed or OCHEM)
      • PAGE: article page
      • TABLE: article table
      • LINE: article line
      • COMMENT: text comment
      • REFERENCE: record reference
      • CYP450 Modulation: value of the property
      • Unit: measurement unit of the property
      • Accuracy: measurement accuracy
      • Interval: measurement interval
      • CYP450 Type: record condition
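The column list above can be turned into a small script that lays out rows in template order. This is a sketch under stated assumptions: the real template is the file linked from the cheat sheet, and its file format is not given in the slides, so CSV is used here only as a stand-in. The function name `build_sheet` is hypothetical, and writing a bare PubMed ID into ARTICLEID is an assumption about the expected identifier format. The sample row uses values from row 3 of the data collection table.

```python
import csv
import io

# Column names exactly as listed on the template slide.
COLUMNS = [
    "CASRN", "SMILES", "NAME", "ARTICLEID", "PAGE", "TABLE", "LINE",
    "COMMENT", "REFERENCE", "CYP450 Modulation", "Unit", "Accuracy",
    "Interval", "CYP450 Type",
]


def build_sheet(rows):
    """Serialize rows (dicts keyed by column name) in template column order.

    Missing columns are left empty; an unknown column name raises ValueError.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [
    {
        "NAME": "Mexiletine",
        "ARTICLEID": "9690950",          # PubMed ID; identifier format is an assumption
        "CYP450 Modulation": "Inhibitor",
        "CYP450 Type": "CYP1A2",
    },
]
sheet = build_sheet(rows)
print(sheet)
```

Letting `csv.DictWriter` reject unknown keys catches misspelled column names before the sheet is uploaded.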
  37. Practical course – batch data introduction – cheat sheet
      • Article URL: http://tinyurl.com/rendic
      • Article title: «Summary of information on human CYP enzymes: human P450 metabolism data»
      • Good chemistry lookup engine: PubChem (find the URL via Google)
      • We search by name and want to get the structure
      • Convenient structure representation: SMILES
      • Property: CYP450 Modulation
      • Condition: CYP450 Type = CYP1A2
      • Reference = 1
      • ArticleID = Q1592
      • Batch upload template URL: http://tinyurl.com/bu-template
  38. Thank you (once more)
