Automated QSAR

Published in: Technology

  1. QSAR choices
     • The QSAR process requires many choices: which descriptors, which modelling algorithm, and what model testing strategy?
     • The quality of the result depends on making the correct choices.
     • All runs are different.
     • The Discovery Bus manages this process.
     • "Apply everything" approach.
  2. The Discovery Bus
     • Manages the many model generation paths, stage by stage:
     • Partition training & test data: random split, 80:20 split
     • Calculate descriptors: Java CDK descriptors, C++ CDL descriptors
     • Select descriptors: correlation analysis, genetic algorithms, random selection
     • Build model: linear regression, neural network, partial least squares, classification trees
     • Add to database
  3. Diagram (labels only): a QSAR Agent exchanges requests and responses with Filter Features ("Filter feature request ..."), Model Build ("Model build request ...") and Calculate Descriptors ("Calculate descriptors request ...").
  4. Diagram (labels only): the same QSAR Agent exchange with Filter Features, Model Build and Calculate Descriptors (the Calculate Descriptors label appears twice); the filter feature and calculate descriptors requests now each show several responses.
  5. Industrial Scale QSAR
     • Predict likely properties based on similar molecules
     • ChEMBL Database: data on 622,824 compounds, collected from 33,956 publications
     • WOMBAT Database: data on 251,560 structures, for over 1,966 targets
     • WOMBAT-PK Database: data on 1,230 compounds, for over 13,000 clinical measurements
     • Project Junior (Newcastle University & Microsoft Research): 10,000 datasets gave 750,000 QSAR models in 3 weeks using 100 Azure Cloud Servers
  6. “The Discovery Bus is not a tool for users. It is a system for doing drug design independent of any user.”
     The ambition is a step-change in productivity arising from breaking the link between human effort and drug discovery output.
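
The "apply everything" idea from slides 1 and 2 can be illustrated by enumerating every combination of stage options. This is only a minimal sketch, not the Discovery Bus implementation: the option names are copied from slide 2, while the `STAGES` dictionary and the `enumerate_paths` helper are hypothetical names introduced for illustration.

```python
from itertools import product

# Stage options as listed on slide 2. The dictionary layout is an
# assumption made for this sketch, not the Discovery Bus data model.
STAGES = {
    "partition": ["random split", "80:20 split"],
    "descriptors": ["Java CDK descriptors", "C++ CDL descriptors"],
    "selection": [
        "correlation analysis",
        "genetic algorithms",
        "random selection",
    ],
    "model": [
        "linear regression",
        "neural network",
        "partial least squares",
        "classification trees",
    ],
}


def enumerate_paths(stages):
    """Yield one dict per model generation path (one option per stage)."""
    names = list(stages)
    for combo in product(*(stages[name] for name in names)):
        yield dict(zip(names, combo))


paths = list(enumerate_paths(STAGES))
print(len(paths))  # 2 * 2 * 3 * 4 = 48 paths
print(paths[0])
```

Even with only the options named on slide 2, a single dataset yields 48 candidate paths, which suggests why the slides hand the bookkeeping to an automated system rather than to a human modeller.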
