Chemoinformatics in Action
 

AACIMP 2009 Summer School lecture by Yuriy Sushko and Sergii Novotarskyi. "Environmental Chemoinformatics" course.

    Chemoinformatics in Action: Presentation Transcript

    • Chemoinformatics in action: some questions for the audience. Yuriy Sushko, Sergii Novotarskyi
    • Practical example. Story: a company that produces or intends to produce a particular compound (a drug, make-up, paint, glue, toilet freshener, whatever) is obliged to test whether this compound is toxic to humans and how toxic it is. What are the options to check this? Tefluthrin, cyclopropanecarboxylic acid
    • Practical example.
        • Bioassay: in vivo and in vitro assays with mice, dogs, rats or other species.
        • Computer modeling (in silico): using QSAR (QSPR) based on machine learning to predict properties of interest without direct experiment.
    • Option 1: Bioassay. The classical and still widely used method for measuring toxicity is a bioassay with mice, rats, dogs or other species. What are the advantages and disadvantages?
    • Option 1: Bioassay. For a bioassay we would typically need:
        • Dozens of mice to check several concentrations of the tested compound
        • In some assays, we need to wait for the next generation
        • We may need to test against several organisms (rat, mouse) and different administration routes (oral, skin, IV injection)
        • A test can take up to several months
        • A test can cost up to tens of thousands of dollars
      What if we need to measure toxicity for 100 000 compounds?
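To put the closing question in numbers, here is a back-of-the-envelope sketch. Only the compound count comes from the slide; the per-compound cost, the test duration and the number of tests run in parallel are assumed round figures, not values from the lecture.

```python
# Rough estimate of what screening many compounds by bioassay would take.
# Every input except N_COMPOUNDS is an illustrative ASSUMPTION.
N_COMPOUNDS = 100_000           # from the question on the slide
COST_PER_COMPOUND_USD = 10_000  # assumed
MONTHS_PER_TEST = 3             # assumed
PARALLEL_TESTS = 100            # hypothetical lab capacity


def bioassay_estimate(n_compounds, cost_per_compound, months_per_test, parallel_tests):
    """Return (total cost in USD, total duration in months)."""
    total_cost = n_compounds * cost_per_compound
    # Ceiling division: number of sequential batches needed.
    batches = -(-n_compounds // parallel_tests)
    total_months = batches * months_per_test
    return total_cost, total_months


total_cost, total_months = bioassay_estimate(
    N_COMPOUNDS, COST_PER_COMPOUND_USD, MONTHS_PER_TEST, PARALLEL_TESTS
)
print(f"Total cost: ${total_cost:,}")
print(f"Total time: {total_months / 12:.0f} years")
```

Under these assumptions the campaign costs on the order of a billion dollars and runs for centuries, which is the motivation for the modeling option.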
    • Option 2: Modeling. What steps are required to build a predictive model for a physicochemical or biological property?
        • Prepare a dataset of experimental data
        • Choose and calculate molecular descriptors
        • Apply a machine learning method
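The three modeling steps can be sketched end to end in a few lines. This is a minimal illustration, not the lecturers' workflow: the property values are invented, the single descriptor (heavy-atom count) is chosen only for simplicity, and `heavy_atom_count`, `fit_line` and `predict` are hypothetical helper names.

```python
# Step 1: prepare a dataset of (molecule, measured property) pairs.
# Molecules are given as heavy-atom counts; property values are INVENTED.
dataset = [
    ({"C": 2}, 2.0),
    ({"C": 3, "O": 1}, 3.0),
    ({"C": 5, "N": 1}, 4.0),
    ({"C": 6, "O": 2}, 5.0),
]


# Step 2: choose and calculate a descriptor (here: number of heavy atoms).
def heavy_atom_count(molecule):
    return sum(molecule.values())


# Step 3: apply a machine learning method
# (here: ordinary least squares with one descriptor).
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [heavy_atom_count(mol) for mol, _ in dataset]
ys = [value for _, value in dataset]
slope, intercept = fit_line(xs, ys)


def predict(molecule):
    """Predict the property of an unseen molecule from its descriptor."""
    return slope * heavy_atom_count(molecule) + intercept


print(predict({"C": 8, "O": 2}))
```

A real QSAR model would use many descriptors and a held-out test set; the point here is only the order of the steps.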
    • Molecular descriptors. What is a descriptor? What are the simplest examples? A descriptor is a numerical property of a chemical compound.
        • Simplest constitutional descriptors: MW, NA, nDB, ...
        • Molecular properties: LogP, hydrophilic factor, ...
        • Randic molecular profiles
        • Various topological and 3D indices and profiles
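As a concrete illustration of constitutional descriptors, the sketch below computes the molecular weight and the total atom count from a molecular formula. The function name and the small atomic-mass table are assumptions made for this example; the table covers only the elements appearing in the molecules shown in this lecture, and the parser handles only plain formulas without brackets or charges.

```python
import re

# Standard atomic weights (rounded) for the few elements used here.
ATOMIC_MASS = {
    "H": 1.008,
    "C": 12.011,
    "O": 15.999,
    "F": 18.998,
    "Cl": 35.45,
    "Br": 79.904,
}

# An element symbol (one capital, optional lowercase) followed by an optional count.
FORMULA_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def constitutional_descriptors(formula):
    """Return molecular weight and atom count for a plain formula such as 'C6H5Br'."""
    mw = 0.0
    n_atoms = 0
    for symbol, count in FORMULA_TOKEN.findall(formula):
        n = int(count) if count else 1
        mw += ATOMIC_MASS[symbol] * n  # raises KeyError for unsupported elements
        n_atoms += n
    return {"MW": round(mw, 3), "n_atoms": n_atoms}


print(constitutional_descriptors("C6H5Br"))
```

Descriptors such as LogP or topological indices need the full molecular graph rather than just the formula, so they are normally computed with a cheminformatics toolkit.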
    • Molecular descriptors
         2.54   4.25  -5.71   3.26   0.57  -0.07
         1.45   6.34   8.28   2.78  -5.67  -2.33
         1.45   7.34   8.35   1.64  -5.56  -4.45
    • Machine learning. What kinds of machine learning methods do you know?
        • Linear regression
        • K nearest neighbors (KNN)
        • Partial least squares (PLS) regression
        • Neural networks
        • Support vector machines
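Of the methods listed, k nearest neighbors is the easiest to write out in full. The sketch below predicts a property as the mean over the k closest training molecules in descriptor space. The descriptor vectors and property values are synthetic, and `knn_predict` is a hypothetical helper name; Euclidean distance is one common choice among several.

```python
import math


def knn_predict(train, query, k=3):
    """Predict a property as the mean value of the k nearest training points.

    train: list of (descriptor_vector, property_value) pairs
    query: descriptor vector of the molecule to predict
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    nearest = by_distance[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Synthetic training data: two clusters in a 2-descriptor space.
train = [
    ((0.0, 0.0), 1.0),
    ((0.1, 0.2), 1.2),
    ((5.0, 5.0), 9.0),
    ((5.2, 4.9), 9.4),
]

print(knn_predict(train, (0.05, 0.1), k=2))
print(knn_predict(train, (5.1, 5.0), k=2))
```

In practice descriptors are usually scaled first, since a descriptor with a large numeric range would otherwise dominate the distance.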
    • Some additional facts. Popular formats for representing molecules in databases:
        • SDF
        • SMILES
        • InChI
    • SDF: a plain text file
        Header:
            benzene
              ACD/Labs0812062058
              6  6  0  0  0  0  0  0  0  0  1 V2000
        Atom information:
                1.9050   -0.7932    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
                1.9050   -2.1232    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
                0.7531   -0.1282    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
                0.7531   -2.7882    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
               -0.3987   -0.7932    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
               -0.3987   -2.1232    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
        Bond information:
              2  1  1  0  0  0  0
              3  1  2  0  0  0  0
              4  2  2  0  0  0  0
              5  3  1  0  0  0  0
              6  4  1  0  0  0  0
              6  5  2  0  0  0  0
            M  END
        Tags:
            > <Unique_ID>
            XCA3464366
            > <ClogP>
            5.825
            > <Vendor>
            Sigma
            > <Molecular Weight>
            499.611
            $$$$
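To show how the header, atom block and bond block fit together, here is a minimal sketch of a reader for the molfile part of an SDF record. It is a simplification under stated assumptions: `parse_molfile` is a hypothetical helper, and it splits lines on whitespace, whereas the V2000 format is defined in fixed-width columns (so this shortcut breaks for molecules with 100 or more atoms or bonds, where the counts run together). The sample reuses the benzene record above, with the blank comment line that the format requires as the third header line.

```python
def parse_molfile(text):
    """Parse the name, atoms and bonds from a small V2000 molfile."""
    lines = text.splitlines()
    name = lines[0].strip()           # header line 1: molecule name
    counts = lines[3].split()         # line 4: counts line
    n_atoms, n_bonds = int(counts[0]), int(counts[1])

    atoms = []
    for line in lines[4:4 + n_atoms]:
        x, y, z, symbol = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    bonds = []
    for line in lines[4 + n_atoms:4 + n_atoms + n_bonds]:
        first, second, order = (int(field) for field in line.split()[:3])
        bonds.append((first, second, order))  # atom indices are 1-based

    return name, atoms, bonds


BENZENE = "\n".join([
    "benzene",
    "  ACD/Labs0812062058",
    "",
    "  6  6  0  0  0  0  0  0  0  0  1 V2000",
    "    1.9050   -0.7932    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.9050   -2.1232    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.7531   -0.1282    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.7531   -2.7882    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.3987   -0.7932    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.3987   -2.1232    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "  2  1  1  0  0  0  0",
    "  3  1  2  0  0  0  0",
    "  4  2  2  0  0  0  0",
    "  5  3  1  0  0  0  0",
    "  6  4  1  0  0  0  0",
    "  6  5  2  0  0  0  0",
    "M  END",
])

name, atoms, bonds = parse_molfile(BENZENE)
print(name, len(atoms), "atoms,", len(bonds), "bonds")
```

For real data, an established toolkit is a safer choice than a hand-written parser, since it handles the fixed-width fields, property lines and data items.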
    • SMILES: a string representation
        C1=CC=C(C=C1)Br
        CC(F)F
        COC(C(Cl)Cl)(F)F
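Because a SMILES string spells out the heavy atoms directly, simple information can be read from it with a regular expression. The sketch below counts heavy atoms per element for the three strings above. It is deliberately limited and is not a SMILES parser: `count_heavy_atoms` is a hypothetical helper that assumes only unbracketed, uppercase organic-subset atoms (no aromatic lowercase atoms, no bracket atoms such as `[Na+]`), and implicit hydrogens are not counted.

```python
import re
from collections import Counter

# Two-letter symbols must come first so that "Cl" is not read as "C".
ATOM = re.compile(r"Cl|Br|[BCNOPSFI]")


def count_heavy_atoms(smiles):
    """Count heavy atoms per element in a simple, bracket-free SMILES string."""
    return Counter(ATOM.findall(smiles))


for smiles in ["C1=CC=C(C=C1)Br", "CC(F)F", "COC(C(Cl)Cl)(F)F"]:
    print(smiles, dict(count_heavy_atoms(smiles)))
```

Ring-closure digits, bond symbols and parentheses are simply skipped by the pattern, which is why the counts come out right for these examples.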
    • InChI: one more approach. InChI (International Chemical Identifier) is a standard developed by IUPAC for textual identifiers of chemical substances.
        InChI: InChI=1S/C6H5Br/c7-6-4-2-1-3-5-6/h1-5H
        InChIKey: QARVLSVVCXYDNA-UHFFFAOYSA
        InChI: InChI=1S/C2H4F2/c1-2(3)4/h2H,1H3
        InChIKey: NPNPZTNLOVBDOC-UHFFFAOYSA
        InChI: InChI=1S/C3H4Cl2F2O/c1-8-3(6,7)2(4)5/h2H,1H3
        InChIKey: RFKMCNOHBTXSMU-UHFFFAOYSA
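An InChI string is a sequence of layers separated by `/`: a version, the molecular formula, and further layers marked by a one-letter prefix (for example `c` for connectivity and `h` for hydrogens). The sketch below splits a string into those layers. `parse_inchi` is a hypothetical helper, and it assumes the second field is always the formula, which holds for the examples above but not for every valid InChI.

```python
def parse_inchi(inchi):
    """Split an InChI string into its layers, keyed by layer prefix."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    layers = {
        "version": parts[0],
        "formula": parts[1] if len(parts) > 1 else "",
    }
    for part in parts[2:]:
        if part:
            # The first character names the layer, the rest is its content.
            layers[part[0]] = part[1:]
    return layers


print(parse_inchi("InChI=1S/C6H5Br/c7-6-4-2-1-3-5-6/h1-5H"))
```

The InChIKey, by contrast, is a fixed-length hash of the InChI, so it cannot be split back into chemical information this way.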