SlideShare a Scribd company logo
QSAR Process requires many choices Which descriptors? Which modelling algorithm? What model testing strategy? Quality of result depends on make correct choices All runs are different Discovery Bus manages this process Apply everything approach QSAR choices
The Discovery Bus Manages the many model generation paths  Random split 80:20 split Partition training & test data Java CDK descriptors C++ CDL descriptors Calculate descriptors Correlation analysis Genetic algorithms Random selection Select descriptors Linear regression Neural Network Partial Least Squares Classification Trees Build model Add to database
Filter Features QSAR Agent Model Build Filter feature request ... responses Model build request ... responses Calculate descriptors request ... responses Calculate Descriptors
Filter Features QSAR Agent Model Build Calculate Descriptors Filter feature request ... responses responses Model build request ... responses Calculate descriptors request ... responses responses responses Calculate Descriptors
Industrial Scale QSAR Predict likely properties based on similar molecules CHEMBL Database: data on 622,824 compounds, collected from 33,956 publications  WOMBAT Database: data on 251,560 structures, for over 1,966 targets WOMBAT-PK Database: data on 1230 compounds, for over 13,000 clinical measurements Project Junior (Newcastle University & Microsoft Research) 10,000 datasets gave 750,000 QSAR models in 3 weeks using 100 Azure Cloud Servers
“The Discovery Bus is not a tool for users. It is a system for doing drug design independent of any user” The ambition is a step-change in productivity arising from breaking the link between human effort and drug discovery output.

More Related Content

Viewers also liked

Qsar studies of saponine analogues for anticancer activity by sagar alone
Qsar studies of saponine analogues for anticancer activity by sagar aloneQsar studies of saponine analogues for anticancer activity by sagar alone
Qsar studies of saponine analogues for anticancer activity by sagar alone
sagar alone
 
QSAR Study on Antitubercular Drug Derivatives
QSAR Study on Antitubercular Drug DerivativesQSAR Study on Antitubercular Drug Derivatives
QSAR Study on Antitubercular Drug Derivatives
Lydia Yeshitla
 
25.qsar
25.qsar25.qsar
Qsar lecture
Qsar lectureQsar lecture
Qsar lecture
shishirkawde
 
Qsar
QsarQsar
Computer aided drug designing
Computer aided drug designingComputer aided drug designing
Computer aided drug designing
Muhammed sadiq
 
Computer aided drug designing
Computer aided drug designing Computer aided drug designing
Computer aided drug designing
Ayesha Aftab
 
QSAR : Activity Relationships Quantitative Structure
QSAR : Activity Relationships Quantitative StructureQSAR : Activity Relationships Quantitative Structure
QSAR : Activity Relationships Quantitative Structure
Saramita De Chakravarti
 
Qsar
QsarQsar
Qsar
Rahul B S
 
Qsar and drug design ppt
Qsar and drug design pptQsar and drug design ppt
Qsar and drug design ppt
Abhik Seal
 
Computer Aided Drug Design ppt
Computer Aided Drug Design pptComputer Aided Drug Design ppt
Computer Aided Drug Design ppt
Hanumant Suryawanshi
 
SAR of Morphine
SAR of MorphineSAR of Morphine
SAR of Morphine
Nabiilah Naraino Majie
 
Computer aided drug design
Computer aided drug designComputer aided drug design
Computer aided drug design
N K
 

Viewers also liked (13)

Qsar studies of saponine analogues for anticancer activity by sagar alone
Qsar studies of saponine analogues for anticancer activity by sagar aloneQsar studies of saponine analogues for anticancer activity by sagar alone
Qsar studies of saponine analogues for anticancer activity by sagar alone
 
QSAR Study on Antitubercular Drug Derivatives
QSAR Study on Antitubercular Drug DerivativesQSAR Study on Antitubercular Drug Derivatives
QSAR Study on Antitubercular Drug Derivatives
 
25.qsar
25.qsar25.qsar
25.qsar
 
Qsar lecture
Qsar lectureQsar lecture
Qsar lecture
 
Qsar
QsarQsar
Qsar
 
Computer aided drug designing
Computer aided drug designingComputer aided drug designing
Computer aided drug designing
 
Computer aided drug designing
Computer aided drug designing Computer aided drug designing
Computer aided drug designing
 
QSAR : Activity Relationships Quantitative Structure
QSAR : Activity Relationships Quantitative StructureQSAR : Activity Relationships Quantitative Structure
QSAR : Activity Relationships Quantitative Structure
 
Qsar
QsarQsar
Qsar
 
Qsar and drug design ppt
Qsar and drug design pptQsar and drug design ppt
Qsar and drug design ppt
 
Computer Aided Drug Design ppt
Computer Aided Drug Design pptComputer Aided Drug Design ppt
Computer Aided Drug Design ppt
 
SAR of Morphine
SAR of MorphineSAR of Morphine
SAR of Morphine
 
Computer aided drug design
Computer aided drug designComputer aided drug design
Computer aided drug design
 

Similar to Automated QSAR

The Developer Data Scientist – Creating New Analytics Driven Applications usi...
The Developer Data Scientist – Creating New Analytics Driven Applications usi...The Developer Data Scientist – Creating New Analytics Driven Applications usi...
The Developer Data Scientist – Creating New Analytics Driven Applications usi...
Microsoft Tech Community
 
Discovery Bus: UK QSAR meeting at GSK
Discovery Bus: UK QSAR meeting at GSKDiscovery Bus: UK QSAR meeting at GSK
Discovery Bus: UK QSAR meeting at GSK
David Leahy
 
Integrative information management for systems biology
Integrative information management for systems biologyIntegrative information management for systems biology
Integrative information management for systems biology
Neil Swainston
 
Building, Debugging, and Tuning Spark Machine Leaning Pipelines-(Joseph Bradl...
Building, Debugging, and Tuning Spark Machine Leaning Pipelines-(Joseph Bradl...Building, Debugging, and Tuning Spark Machine Leaning Pipelines-(Joseph Bradl...
Building, Debugging, and Tuning Spark Machine Leaning Pipelines-(Joseph Bradl...
Spark Summit
 
BioWeka
BioWekaBioWeka
BioWeka
Martin Szugat
 
Practical Distributed Machine Learning Pipelines on Hadoop
Practical Distributed Machine Learning Pipelines on HadoopPractical Distributed Machine Learning Pipelines on Hadoop
Practical Distributed Machine Learning Pipelines on Hadoop
DataWorks Summit
 
Barga Data Science lecture 8
Barga Data Science lecture 8Barga Data Science lecture 8
Barga Data Science lecture 8
Roger Barga
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning Software
Justin Basilico
 
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
MLconf
 
Guiding through a typical Machine Learning Pipeline
Guiding through a typical Machine Learning PipelineGuiding through a typical Machine Learning Pipeline
Guiding through a typical Machine Learning Pipeline
Michael Gerke
 
All Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZAll Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZ
confluent
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at Netflix
Justin Basilico
 
Joseph Bradley, Software Engineer, Databricks Inc. at MLconf SEA - 5/01/15
Joseph Bradley, Software Engineer, Databricks Inc. at MLconf SEA - 5/01/15Joseph Bradley, Software Engineer, Databricks Inc. at MLconf SEA - 5/01/15
Joseph Bradley, Software Engineer, Databricks Inc. at MLconf SEA - 5/01/15
MLconf
 
Mechanisms for Database Intrusion Detection and Response
Mechanisms for Database Intrusion Detection and ResponseMechanisms for Database Intrusion Detection and Response
Mechanisms for Database Intrusion Detection and Response
Ashish Kamra
 
Machine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and ApplicationsMachine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and Applications
QuantUniversity
 
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
SQUADEX
 
Weka : A machine learning algorithms for data mining
Weka : A machine learning algorithms for data miningWeka : A machine learning algorithms for data mining
Weka : A machine learning algorithms for data mining
Keshab Kumar Gaurav
 
Spark Summit EU talk by Michael Nitschinger
Spark Summit EU talk by Michael NitschingerSpark Summit EU talk by Michael Nitschinger
Spark Summit EU talk by Michael Nitschinger
Spark Summit
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning Software
Justin Basilico
 
Chemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collectionChemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collection
Valery Tkachenko
 

Similar to Automated QSAR (20)

The Developer Data Scientist – Creating New Analytics Driven Applications usi...
The Developer Data Scientist – Creating New Analytics Driven Applications usi...The Developer Data Scientist – Creating New Analytics Driven Applications usi...
The Developer Data Scientist – Creating New Analytics Driven Applications usi...
 
Discovery Bus: UK QSAR meeting at GSK
Discovery Bus: UK QSAR meeting at GSKDiscovery Bus: UK QSAR meeting at GSK
Discovery Bus: UK QSAR meeting at GSK
 
Integrative information management for systems biology
Integrative information management for systems biologyIntegrative information management for systems biology
Integrative information management for systems biology
 
Building, Debugging, and Tuning Spark Machine Leaning Pipelines-(Joseph Bradl...
Building, Debugging, and Tuning Spark Machine Leaning Pipelines-(Joseph Bradl...Building, Debugging, and Tuning Spark Machine Leaning Pipelines-(Joseph Bradl...
Building, Debugging, and Tuning Spark Machine Leaning Pipelines-(Joseph Bradl...
 
BioWeka
BioWekaBioWeka
BioWeka
 
Practical Distributed Machine Learning Pipelines on Hadoop
Practical Distributed Machine Learning Pipelines on HadoopPractical Distributed Machine Learning Pipelines on Hadoop
Practical Distributed Machine Learning Pipelines on Hadoop
 
Barga Data Science lecture 8
Barga Data Science lecture 8Barga Data Science lecture 8
Barga Data Science lecture 8
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning Software
 
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
Justin Basilico, Research/ Engineering Manager at Netflix at MLconf SF - 11/1...
 
Guiding through a typical Machine Learning Pipeline
Guiding through a typical Machine Learning PipelineGuiding through a typical Machine Learning Pipeline
Guiding through a typical Machine Learning Pipeline
 
All Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZAll Streams Ahead! ksqlDB Workshop ANZ
All Streams Ahead! ksqlDB Workshop ANZ
 
Lessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at NetflixLessons Learned from Building Machine Learning Software at Netflix
Lessons Learned from Building Machine Learning Software at Netflix
 
Joseph Bradley, Software Engineer, Databricks Inc. at MLconf SEA - 5/01/15
Joseph Bradley, Software Engineer, Databricks Inc. at MLconf SEA - 5/01/15Joseph Bradley, Software Engineer, Databricks Inc. at MLconf SEA - 5/01/15
Joseph Bradley, Software Engineer, Databricks Inc. at MLconf SEA - 5/01/15
 
Mechanisms for Database Intrusion Detection and Response
Mechanisms for Database Intrusion Detection and ResponseMechanisms for Database Intrusion Detection and Response
Mechanisms for Database Intrusion Detection and Response
 
Machine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and ApplicationsMachine Learning and AI: Core Methods and Applications
Machine Learning and AI: Core Methods and Applications
 
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
Tooling for Machine Learning: AWS Products, Open Source Tools, and DevOps Pra...
 
Weka : A machine learning algorithms for data mining
Weka : A machine learning algorithms for data miningWeka : A machine learning algorithms for data mining
Weka : A machine learning algorithms for data mining
 
Spark Summit EU talk by Michael Nitschinger
Spark Summit EU talk by Michael NitschingerSpark Summit EU talk by Michael Nitschinger
Spark Summit EU talk by Michael Nitschinger
 
Recommendations for Building Machine Learning Software
Recommendations for Building Machine Learning SoftwareRecommendations for Building Machine Learning Software
Recommendations for Building Machine Learning Software
 
Chemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collectionChemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collection
 

Recently uploaded

TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
Brandon Minnick, MBA
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
ScyllaDB
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
Ivanti
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
Postman
 
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
Data Hops
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
Dinusha Kumarasiri
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
Hiroshi SHIBATA
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
Javier Junquera
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
panagenda
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Tatiana Kojar
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 

Recently uploaded (20)

TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
Choosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptxChoosing The Best AWS Service For Your Website + API.pptx
Choosing The Best AWS Service For Your Website + API.pptx
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
 
June Patch Tuesday
June Patch TuesdayJune Patch Tuesday
June Patch Tuesday
 
WeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation TechniquesWeTestAthens: Postman's AI & Automation Techniques
WeTestAthens: Postman's AI & Automation Techniques
 
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3FREE A4 Cyber Security Awareness  Posters-Social Engineering part 3
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
 
Azure API Management to expose backend services securely
Azure API Management to expose backend services securelyAzure API Management to expose backend services securely
Azure API Management to expose backend services securely
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
 
HCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAUHCL Notes and Domino License Cost Reduction in the World of DLAU
HCL Notes and Domino License Cost Reduction in the World of DLAU
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 

Automated QSAR

  • 1. QSAR Process requires many choices Which descriptors? Which modelling algorithm? What model testing strategy? Quality of result depends on make correct choices All runs are different Discovery Bus manages this process Apply everything approach QSAR choices
  • 2. The Discovery Bus Manages the many model generation paths Random split 80:20 split Partition training & test data Java CDK descriptors C++ CDL descriptors Calculate descriptors Correlation analysis Genetic algorithms Random selection Select descriptors Linear regression Neural Network Partial Least Squares Classification Trees Build model Add to database
  • 3. Filter Features QSAR Agent Model Build Filter feature request ... responses Model build request ... responses Calculate descriptors request ... responses Calculate Descriptors
  • 4. Filter Features QSAR Agent Model Build Calculate Descriptors Filter feature request ... responses responses Model build request ... responses Calculate descriptors request ... responses responses responses Calculate Descriptors
  • 5. Industrial Scale QSAR Predict likely properties based on similar molecules CHEMBL Database: data on 622,824 compounds, collected from 33,956 publications WOMBAT Database: data on 251,560 structures, for over 1,966 targets WOMBAT-PK Database: data on 1230 compounds, for over 13,000 clinical measurements Project Junior (Newcastle University & Microsoft Research) 10,000 datasets gave 750,000 QSAR models in 3 weeks using 100 Azure Cloud Servers
  • 6. “The Discovery Bus is not a tool for users. It is a system for doing drug design independent of any user” The ambition is a step-change in productivity arising from breaking the link between human effort and drug discovery output.