Automatic Compound Design by
Matched Molecular Pairs
Willem van Hoorn
Senior Solutions Consultant
Professional Services
Contents

•   Matched Molecular Pairs (MMPs)
•   Implementation in PP
•   Reaction Fingerprints
•   Using MMPs as automatic learning machine
Ceci n’est pas une MMP (This is not an MMP)
Sildenafil vs. Vardenafil
Similarity = 0.55 / 0.98 (ECFP_4 / MDL public keys)

MMP:
- Single change
- Typically: 1 or 2 bond cleavages; replace R-group or template
Recent AZ review




                   http://pubs.acs.org/doi/abs/10.1021/jm200452d
MMP as predictor of activity




Classic QSAR with full molecule descriptors vs. QSAR using MMP

ΔpIC50(m-Br to m-Cl-p-F) = -0.19
What have the MMPs done for us?

Classic QSAR / regression
• More generic, can predict >1 change
• Interpretability varies

MMPs
• Can only predict “one step away from known”
• Very interpretable
• Can answer “what to make next” challenge
“Learning Machine” using MMPs
Example of MMP learning machine




The 1 → 2 transformation applied to compound 3 should yield the more attractive compound 4
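
The idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Pipeline Pilot protocol: the pIC50 values and the helper `apply_transformation` are hypothetical, and the effect of a transformation is assumed to be additive on the pIC50 scale.

```python
# Hypothetical measured potencies for the matched pair (compounds 1 and 2)
# and for compound 3, which is assumed to carry the same group as compound 1.
pic50 = {"compound_1": 6.2, "compound_2": 7.0, "compound_3": 6.8}


def apply_transformation(pic50_start, delta_pic50):
    """Predict the pIC50 of the product of a transformation with a known effect."""
    return pic50_start + delta_pic50


# Effect of the 1 -> 2 transformation, learned from the matched pair.
delta_1_to_2 = pic50["compound_2"] - pic50["compound_1"]

# The same transformation applied to compound 3 gives the design idea, compound 4.
predicted_compound_4 = apply_transformation(pic50["compound_3"], delta_1_to_2)
print(f"predicted pIC50 of compound 4: {predicted_compound_4:.1f}")
```

The prediction is only as good as the assumption that the change has the same effect in the new context, which is the point the later slides on reaction fingerprints address.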
MMP in Pipeline Pilot


                        Components




                        Protocols




PP 8.5 CU1
PP MMP algorithm based on GSK publication
Test set: EGFR from ChEMBL




              - ChEMBL version 11
              - 4609 IC50 values
              - 3581 compounds

                                    Ed Griffen et al
                                    J Med Chem. 2011, 54, 7739-50
Generate MMPs and transformations

>90k MMPs in <1 minute
Slow!




     MMP output
                                            Full transformation




                                       MMP transformation
pIC50 distribution of transformations
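90,343 MMPs yield 180,684 transformations (A→B / B→A)

[Histogram of the pIC50 change per transformation. Centre: bioisosteres. Both tails: activity cliffs. X-axis marked in potency changes of 10-fold, 100-fold, 1000-fold, etc.]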
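
A minimal sketch of how each matched pair gives a transformation in both directions, and of how a pIC50 difference maps onto the fold changes on the axis. The fragment labels and pIC50 values are made up for illustration, and the function names are hypothetical.

```python
def directed_transformations(pairs):
    """Expand each matched pair (a, b, pIC50_a, pIC50_b) into A->B and B->A."""
    out = []
    for a, b, pic50_a, pic50_b in pairs:
        delta = pic50_b - pic50_a
        out.append((a, b, delta))    # A -> B
        out.append((b, a, -delta))   # B -> A has the opposite effect
    return out


def fold_change(delta_pic50):
    """Potency ratio implied by a pIC50 difference (pIC50 is a log10 scale)."""
    return 10 ** abs(delta_pic50)


# Hypothetical fragment pairs with made-up pIC50 values.
pairs = [("F", "Cl", 7.0, 6.5), ("H", "OMe", 5.0, 8.0)]
transformations = directed_transformations(pairs)
for frm, to, delta in transformations:
    print(f"{frm} -> {to}: delta pIC50 = {delta:+.1f}, {fold_change(delta):.0f}-fold")
```

A difference of 1 pIC50 unit is a 10-fold change in potency, 2 units is 100-fold, and so on, which is why the tails of the distribution are where the activity cliffs sit.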
MMP transformations vs. full reactions




MMP transformation: not specific enough, seen >>1 in data set but large stddev(ΔpIC50)
Full reaction: too specific, seen once in dataset, ΔpIC50 statistics n=1

Would like to have something that describes “reaction centre + nearby environment”

Would like to increase confidence by looking at similar MMP transformations (with similar ΔpIC50)
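
The trade-off above can be made concrete by grouping observations per transformation and looking at n, mean and standard deviation. This is a sketch with invented data; the transformation keys and the function `summarise` are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarise(observations):
    """Group (transformation, delta_pIC50) observations; return n, mean, stddev per key."""
    groups = defaultdict(list)
    for key, delta in observations:
        groups[key].append(delta)
    summary = {}
    for key, deltas in groups.items():
        # The sample standard deviation is undefined for a single observation.
        sd = stdev(deltas) if len(deltas) > 1 else None
        summary[key] = (len(deltas), mean(deltas), sd)
    return summary


# A generic transformation seen three times with scattered effects, and a
# fully specified one (core included) that is seen only once.
observations = [
    ("H>>F", 0.2),
    ("H>>F", -1.1),
    ("H>>F", 1.4),
    ("core-H>>core-OMe", 1.9),
]
summary = summarise(observations)
for key, (n, avg, sd) in summary.items():
    print(key, n, round(avg, 2), sd)
```

The generic key has enough observations for statistics but a wide spread, while the specific key has no spread estimate at all.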
PP reaction fingerprints: RCFP




 • RCFP are similar to ECFP, atoms described by:
   – Charge
   – Hybridization
   – Whether the atom is Reactant or Product
   – Whether or not the atom is in the “Reaction Site”
 • Need mapped reactions

PP 8.5
Reaction mapping is necessary
Mapped: only features describing reaction site
Unmapped: all features, no information whether atom is in product or reactant
Reaction direction matters




  Reaction fingerprints are not identical: A → B ≠ B → A
MMP transformation as rules



Context of MMP transformation

“Rule” = MMP transformation
Effect = ΔpIC50
Tanimoto search of MMP transformations

A single observation… becomes more believable when looking at similars

ΔpIC50 values shown for the transformations: 1.3, 1.9, 1.5, 1.8
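
A sketch of such a similarity search, assuming each transformation fingerprint is available as a set of integer feature ids. The feature ids, transformation names and ΔpIC50 values below are invented, and `search` is a hypothetical helper rather than a Pipeline Pilot component.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of feature ids."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return len(fp_a & fp_b) / len(union)


def search(query_fp, library, threshold=0.5):
    """Return (similarity, name, delta_pIC50) for library entries above the threshold."""
    hits = []
    for name, fp, delta in library:
        sim = tanimoto(query_fp, fp)
        if sim >= threshold:
            hits.append((sim, name, delta))
    return sorted(hits, reverse=True)


# Hypothetical library of transformation fingerprints with observed effects.
library = [
    ("T1", {1, 2, 3, 4}, 1.3),
    ("T2", {1, 2, 3, 9}, 1.5),
    ("T3", {7, 8}, -0.4),
]
query_fp = {1, 2, 3, 5}
hits = search(query_fp, library)
for sim, name, delta in hits:
    print(f"{name}: similarity {sim:.2f}, delta pIC50 {delta:+.1f}")
```

If the neighbours returned by the search show a similar effect, the single observation for the query gains support.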
Express significance as Bayesian probability

Bayesian model
“Good” molecules: ΔpIC50 ≥ 1

Rank test set by likelihood that a transformation will yield a ≥10-fold increase in potency
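
The slides do not say which estimator the Bayesian model uses. As an assumption, the sketch below uses a Laplacian-corrected naive Bayes weight per fingerprint feature, log((A + 1) / (N · P + 1)), where N is the number of samples containing the feature, A the number of those that are "good", and P the overall fraction of good samples. The feature labels and training data are invented.

```python
import math
from collections import Counter


def train(samples):
    """samples: list of (features, is_good). Returns a log-weight per feature."""
    n_total = len(samples)
    base_rate = sum(1 for _, good in samples if good) / n_total
    seen = Counter()
    good_count = Counter()
    for features, good in samples:
        for f in set(features):
            seen[f] += 1
            if good:
                good_count[f] += 1
    # Laplacian-corrected estimate: rare features are pulled towards a weight of 0.
    return {
        f: math.log((good_count[f] + 1) / (seen[f] * base_rate + 1))
        for f in seen
    }


def score(weights, features):
    """Sum the weights of the features present; unseen features contribute 0."""
    return sum(weights.get(f, 0.0) for f in set(features))


# Invented training data: feature sets of transformations, labelled good
# (delta pIC50 >= 1) or not.
samples = [
    ({"a", "b"}, True),
    ({"a"}, True),
    ({"c"}, False),
    ({"b", "c"}, False),
]
weights = train(samples)
print(score(weights, {"a"}), score(weights, {"c"}))
```

A higher score means the transformation looks more like those that gave at least a 10-fold increase in the training data, so sorting by score gives the ranking described above.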
Bayes can predict MMP 10 fold increase
[Enrichment plots of test set. X-axis: % of Samples. Y-axis: % Actives Captured. Curves: Random Model, Perfect Model, and the dActivity_class_increase models built on RCFP_2, RCFP_4 and RCFP_6.]

• RCFP_6 > RCFP_4
• RCFP_4 >> RCFP_2
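
An enrichment curve of this kind can be computed as follows. This is a generic sketch with invented scores and labels; it assumes at least one active in the set.

```python
def enrichment_curve(scores, is_active):
    """Return (fraction of samples screened, fraction of actives captured) points."""
    ranked = sorted(zip(scores, is_active), key=lambda p: p[0], reverse=True)
    total_actives = sum(is_active)
    points = [(0.0, 0.0)]
    found = 0
    for i, (_, active) in enumerate(ranked, start=1):
        found += active  # True counts as 1, False as 0
        points.append((i / len(ranked), found / total_actives))
    return points


# Invented example: four transformations, two of which truly give >= 10-fold.
scores = [0.9, 0.8, 0.3, 0.1]
is_active = [True, False, True, False]
points = enrichment_curve(scores, is_active)
for x, y in points:
    print(f"{x:.0%} screened, {y:.0%} of actives captured")
```

A random ranking follows the diagonal, while a perfect ranking reaches 100% of actives after screening only as many samples as there are actives.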
Confidence vs. pIC50




[Plot; axis label: pIC50]

Bayesian score = confidence
Semi-quantitative Bayesian predictions
                             • Multi-category Bayesian
                             • Class = pIC50 bin
                             • RCFP_6




                              Compare:
                              • Normalised Probability (default)
                              • #Enrichment
                              • #EstPGood
                              • Prediction
#EstPGood score smallest prediction error
[Figure panels; values shown: 22.5%, 22.5%, 30.0%, 19%]
MMP vs. Full molecule transformations


22.5% vs. 30.0%




        Modelling with mapped reactions works better (it should)
MMP Idea Generator: Training




• 80% training set
  – Generate MMP transformations
  – Learn classic regression model (PLS)
  – Learn Bayesian model from reaction fingerprints
MMP Idea Generator: Test


          ~34k transformations     >6.5M design ideas




 Test set


    • ~5.6 predictions per test set molecule
    • MMP pIC50 := mean(pIC50 of reactant + ΔpIC50 of transformation)
    • RCFP pIC50 := mean(pIC50 of reactant + ΔpIC50 predicted by Bayes)
Runtime ~ 30 min
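
The averaging step can be sketched as below: every design idea that lands on the same product contributes one prediction, and the product's predicted pIC50 is their mean. The product labels and numbers are invented, and `predict_pic50` is a hypothetical helper.

```python
from statistics import mean


def predict_pic50(ideas):
    """ideas: list of (product, pIC50_reactant, delta_pIC50). Mean prediction per product."""
    by_product = {}
    for product, pic50_reactant, delta in ideas:
        by_product.setdefault(product, []).append(pic50_reactant + delta)
    return {product: mean(values) for product, values in by_product.items()}


# Two different reactant/transformation routes lead to product X, one to Y.
ideas = [
    ("X", 6.0, 1.0),
    ("X", 7.5, 0.5),
    ("Y", 5.0, -0.5),
]
predictions = predict_pic50(ideas)
print(predictions)
```

The same function serves both variants on the slide: the delta comes either from the observed transformation statistics (MMP) or from the Bayesian model (RCFP).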
QSAR by MMP
QSAR by Bayes / RCFP_6
SAR by MMP vs. SAR by PLS
PLS descriptors: ECFP_6 / phys property descriptors

[Two panels: MMP and PLS]

    • MMP predictions nearly as good as PLS predictions
    • Not a 100% like-for-like comparison: fewer predictions for MMP
Consensus MMP & PLS predictions


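Top-5% compounds found / compounds in each region:
• Found by MMP only: 11 / 56
• Consensus (MMP and PLS): 26 / 62
• Found by PLS only: 10 / 56
• Selected by neither: 12 / 1006

Red: top 5% by pIC50 (59)
Solid: top 10% (118) by MMP or PLS. Total = 174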
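
The counts on this slide can be turned into hit rates per region. The sketch reads the four number pairs as (top-5% compounds found / compounds in that region), with the MMP and PLS regions excluding the consensus overlap; that reading is an interpretation, but it is consistent with the totals quoted (56 + 62 = 118 per method, 56 + 56 + 62 = 174 selected, 11 + 26 + 10 + 12 = 59 top-5% compounds).

```python
# (top-5% compounds found, compounds in region), as read off the slide.
regions = {
    "MMP only": (11, 56),
    "consensus": (26, 62),
    "PLS only": (10, 56),
    "neither": (12, 1006),
}

total_hits = sum(hits for hits, _ in regions.values())
total_compounds = sum(n for _, n in regions.values())
hit_rate = {name: hits / n for name, (hits, n) in regions.items()}

for name, rate in hit_rate.items():
    print(f"{name}: {rate:.1%}")
print(f"overall: {total_hits / total_compounds:.1%}")
```

On these numbers the consensus region has a hit rate of about 42%, against roughly 20% and 18% for compounds picked by only one method and 5% overall.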
Conclusions

• For one dataset it has been shown that
  – MMP transformations can form the basis of an
    automatic “Learning Machine”
  – Can select “significant rules”
  – Consensus MMP/regression activity prediction
    works better than individual predictions
Spares
MMP vs. Bayes/RCFP predictions

Editor's Notes

  1. “Ceci n’est pas une paire de moléculaires appariés”: this is not a matched molecular pair.
  2. Note: duplicates! Same transformation (but with different common core). Same transformation can have different delta(pIC50).
  3. For the eagle-eyed: note that the transformation on the left is not derived from the pair on the right
  4. X-axis: test set reactions sorted by Bayesian score, i.e. the likelihood that they increase potency at least 10-fold. Y-axis: retrieval rate (percentage) of true reactions that increase potency at least 10-fold (i.e. true positives). Random model: screen a random 50%, find 50% of the true positives. This is the line of decency. Perfect model: hypothetical model that first identifies all true positives, then the rest. In between random and perfect are the real models. This is a typical graph for a good Bayesian model.
  5. X-axis: difference between predicted pIC50 bin and real pIC50 bin. The smaller the better
  6. Could filter out low confidence transformations from the 24k set. Could also remove ones that add too much Mw, etc. Many of the design ideas are duplicates or invalid structures.
  7. Need to investigate this… Also note larger deviations at low activity range.