David Evans and George Papadatos
Lilly Research Centre, Erl Wood Manor, Windlesham, UK

22nd September 2011
• Discover new chemotypes
• Multiobjective space
 •   Isosteres in activity
 •   Improvements in properties

• Want to use multiple tools in same
  environment
 •   But understand what works when
• Open Source Workflow tool – main client is free

• But support is available and can integrate commercial vendors + in-house code as nodes

• Have released many Erl Wood nodes to KNIME community site
    • http://tech.knime.org/community/erlwood
Xedmin
• XED minimization
• 2D -> 3D

Xedex
• Conformational analysis

FieldAlign
• Flexible alignment of query molecules onto template

FieldView
• Launches FieldView
• View field points + energies + other data

All nodes pass SDF
WHY                                         ?
Process is more than just the database search




               Company Confidential
               Copyright © 2008 Eli Lilly and Company
• Don’t want to load all databases onto all users’ PCs
• Components: command-line search, SOAP Web Service (Apache Tomcat), node
• + secure intranet!
• Platform-independent communication
Non-proprietary structure

• Read in pre-built hypothesis (MOE, Phase)
• Or sketch from template molecule
• Jmol based visualizer
• Can also annotate and filter hits, aids manual inspection
How well do automated pharmacophore
methods do compared to 2D methods?
Maximum Unbiased Validation (MUV) dataset

• 17 targets, each with 30 ligands and 15000 decoys;
source: PubChem bioactivity data.

• Wide-ranging targets: hormone receptors, kinases, proteases,
GPCRs plus others (e.g. HSP90, HIV RT).

• Unbiased for chemical analogues as MUV ligands pre-clustered with 2D fingerprint
    • 1.16 compounds per scaffold class

                             MUV: J. Chem. Inf. Model., 2009, 49 (2), 169-184.
• Have looked at whole molecule similarity

• Is there more data if we find fragments which maintain
activity?

• Matched Molecular pair analysis (MMP)
•   Fragments compounds and finds pairs where only one fragment differs
The mining and statistical analysis of transformations and
their impact on properties of interest (e.g. solubility or
activity)
  left molecule   right molecule   transformation   ΔSolubility (mg/ml)
  [structure]     [structure]      H >> F           -0.8
  [structure]     [structure]      Br >> OCH3       +1.2
  [structure]     [structure]      [not recovered]  +2.4
(*in an automated and unsupervised way)
  It used to be a slow and computationally expensive process...
   •   Pair-wise maximum common substructure extraction – O(N²)

  Recently a much more efficient algorithm was published:
        1) Cleave all acyclic single bonds, one by one
        2) Index all the fragments (cf. book index)
        3) Enumerate the values for each key: Mol A >> Mol B

  Hussain and Rea (2010). J. Chem. Inf. Model., 50 (3), 339-348.
  Wagener and Lommerse (2006). J. Chem. Inf. Model., 46 (2), 677-685.
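Steps 2 and 3 above can be sketched in a few lines of Python. This is only an illustration of the indexing idea, not the Erl Wood node or the published implementation. Assumptions: step 1 (bond cleavage) has already been done, so the input is a hand-made list of single-cut fragmentations; the molecule IDs, the fragment strings and the function names are hypothetical; and each molecule is indexed under one context only, whereas a full implementation would index every cut.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical single-cut fragmentations: (molecule ID, context, variable fragment).
# "[*]" marks the attachment point left by the cleaved bond.
FRAGMENTATIONS = [
    ("mol_A", "[*]c1ccccc1", "[*]Br"),
    ("mol_B", "[*]c1ccccc1", "[*]OC"),
    ("mol_C", "[*]c1ccccc1", "[*]F"),
    ("mol_D", "[*]C1CCCCC1", "[*]F"),
]


def index_fragments(fragmentations):
    """Step 2: file every variable fragment under its context key (cf. book index)."""
    index = defaultdict(list)
    for mol_id, context, fragment in fragmentations:
        index[context].append((mol_id, fragment))
    return index


def enumerate_pairs(index):
    """Step 3: any two entries under the same key form a matched pair."""
    pairs = []
    for context, entries in index.items():
        # Ordered pairs, so both A >> B and B >> A are reported.
        for (left_id, left_frag), (right_id, right_frag) in permutations(entries, 2):
            if left_frag != right_frag:
                transformation = f"{left_frag}>>{right_frag}"
                pairs.append((left_id, right_id, transformation, context))
    return pairs


pairs = enumerate_pairs(index_fragments(FRAGMENTATIONS))
for pair in pairs:
    print(pair)
```

Because pairs are only formed between entries under the same key, the work grows with the size of each index entry rather than with all N x N molecule comparisons. Here mol_D shares no context with the others and so yields no pair.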
In: MolRegnos (IDs), structures (in RDKit format) and
property values

Out: Matched pairs (left and right molecule, IDs,
transformation, property values, ΔP, context,
transformation atom count)

Available as an Erl Wood community contribution node
Find isosteres in chEMBL
chEMBL
       – Database of published medicinal chemistry activity data
       – Using chEMBL_10, total >1,000,000 compounds


Use here just human protein kinase inhibitors
Quality assurance for chEMBL data (SQL statement)
   •     Med. chem. friendly compounds, parent structure, not downgraded, confidence score = 9, exact IC50 or Ki values only (converted to pIC50/pKi): ~14K data points
   •     Compare biological values coming from the same assay ID only

Aggregate transformations; calculate and bin ΔpIC50s in 3 bins
   •     Good – Bad – Neutral (depending on a cut-off c = 0.4 log units)
• Each transformation has a neutral count

• Absolute value or percentage: NeutralCount%
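The binning and the neutral count can be sketched as follows. This is a minimal illustration, not the workflow itself. Assumptions: the sign convention (ΔpIC50 above +c is Good, below -c is Bad, otherwise Neutral) is a guess, since the slides only give the cut-off; the function names and the example records are made up.

```python
from collections import Counter, defaultdict

CUTOFF = 0.4  # log units, the cut-off c quoted for the kinase data


def bin_delta(delta, cutoff=CUTOFF):
    """Assign one ΔpIC50 to a bin (assumed convention: > +c Good, < -c Bad)."""
    if delta > cutoff:
        return "Good"
    if delta < -cutoff:
        return "Bad"
    return "Neutral"


def summarise(records, cutoff=CUTOFF):
    """Aggregate (transformation, ΔpIC50) records into per-transformation bin counts."""
    counts = defaultdict(Counter)
    for transformation, delta in records:
        counts[transformation][bin_delta(delta, cutoff)] += 1

    summary = {}
    for transformation, bins in counts.items():
        total = sum(bins.values())
        summary[transformation] = {
            "Good": bins["Good"],
            "Bad": bins["Bad"],
            "Neutral": bins["Neutral"],
            "NeutralCount%": 100.0 * bins["Neutral"] / total,
        }
    return summary


# Made-up records, purely for illustration.
records = [
    ("[*]Br>>[*]OC", 0.1), ("[*]Br>>[*]OC", -0.3),
    ("[*]Br>>[*]OC", 0.9), ("[*]Br>>[*]OC", 0.2),
    ("[*]F>>[*]Cl", -1.2), ("[*]F>>[*]Cl", 0.0),
]
summary = summarise(records)
for transformation, row in summary.items():
    print(transformation, row)
```

The same functions would cover a property such as ΔlogS by passing a different cut-off.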
chEMBL workflow outputs isosteric fragments

• How similar are isosteres in 2D fingerprint space?
• In field space?
• Could fields help us find unexpected isosteres?
• 1802 fragment pairs from chEMBL_10 kinase data set

• 481 with no rotatable bonds left or right
    • Simplifies conformational analysis

• For each fragment pair
   1. Swap attachment points for adamantyl
   2. FieldAlign to get field similarity (use adamantyl to constrain overlay)
   3. RDKit fingerprint similarity – topological, Daylight-esque
   4. Correct similarities for adamantyl

• Are there isosteric pairs with high field similarity but low RDKit similarity?
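The closing question can be phrased as a simple filter. The sketch below is illustrative only. Assumptions: fingerprints are represented as plain Python sets of on-bits (in practice they would come from RDKit), field similarities are taken as already computed by FieldAlign, the two thresholds are arbitrary, and the adamantyl correction of step 4 is not reproduced because the slides do not say how it was done. All names and numbers are hypothetical.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not bits_a and not bits_b:
        return 0.0
    shared = len(bits_a & bits_b)
    return shared / (len(bits_a) + len(bits_b) - shared)


def unexpected_isosteres(fragment_pairs, min_field=0.7, max_2d=0.3):
    """Return names of pairs with high field similarity but low 2D similarity.

    The thresholds are arbitrary illustration values, not taken from the slides.
    """
    hits = []
    for pair in fragment_pairs:
        sim_2d = tanimoto(pair["fp_left"], pair["fp_right"])
        if pair["field_sim"] >= min_field and sim_2d <= max_2d:
            hits.append(pair["name"])
    return hits


# Made-up fragment pairs: field similarity plus toy fingerprints.
fragment_pairs = [
    {"name": "pair_1", "field_sim": 0.82, "fp_left": {1, 2, 3, 4}, "fp_right": {4, 5, 6, 7}},
    {"name": "pair_2", "field_sim": 0.85, "fp_left": {1, 2, 3, 4}, "fp_right": {1, 2, 3, 5}},
    {"name": "pair_3", "field_sim": 0.40, "fp_left": {1, 2}, "fp_right": {8, 9}},
]
hits = unexpected_isosteres(fragment_pairs)
print(hits)
```

Only pair_1 passes: pair_2 is similar in both spaces and pair_3 is dissimilar in both.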
[Scatter plot: Field Sim (y-axis) vs RDKit Sim (x-axis); point size by Neutral Count % (larger = more isosteric). Highlighted regions: pairs with high field similarity but low 2D similarity; pairs with high field and 2D similarity.]
[Scatter plot: Field Sim (y-axis) vs RDKit Sim (x-axis); point size by Neutral Count %; only those with >60% isosteric examples. Highlighted: Thiophene -> Phenol.]
[Scatter plot: Field Sim (y-axis) vs RDKit Sim (x-axis); point size by Neutral Count %; only those with >60% isosteric examples. Highlighted: Imidazole -> Morpholine?]
[Scatter plot: Field Sim (y-axis) vs RDKit Sim (x-axis); point size by Neutral Count %; only those with >60% isosteric examples. Highlighted: some small fragments.]
[Figure: WEE1 kinase, PDB 2I06; non-proprietary structure (from PDB). Labelled regions: buried, solvent-exposed.]
[Scatter plot: Field Sim (y-axis) vs RDKit Sim (x-axis); point size by Neutral Count %; only those with >60% isosteric examples. Highlighted: Me-tetrazole -> oxadiazole.]
[Scatter plot: Field Sim (y-axis) vs RDKit Sim (x-axis); point size by Neutral Count %; only those with >60% isosteric examples. Highlighted: Thiophene -> phenol.]
[Figure: non-proprietary structure (from PDB)]
• 6299 data points from thermodynamic solubility assay

• 423 single-point transformations

• 215 single-point transformations with no rotatable bonds

• Aggregate transformations; calculate and bin ΔlogS in 3 bins
   • Good – Bad – Neutral (c = 0.3 log units)



• Are there transformations which increase solubility with low field similarity but high RDKit similarity?
[Scatter plot: Field Sim (y-axis) vs RDKit Sim (x-axis); point size by Good Count %; only those with >60% boosting examples. Highlighted: ring contraction + twist?]
[Scatter plot: Field Sim (y-axis) vs RDKit Sim (x-axis); point size by Good Count %; only those with >60% boosting examples. Highlighted: big boost from morpholine.]
• Can mine chEMBL data for non-obvious isosteres
   • Will other data sets find more?
• Would like to improve workflow to make isostere data set for 3D similarity comparison
   • Improve fragmentation/conformer/alignment handling?
   • Need to include whole molecule?
   • Need 3D binding site data as well to confirm isosterism?

• KNIME platform developing
   • Virtual screening and evaluation environment
   • Rapid experimentation with varied tools
   • http://tech.knime.org/community/erlwood
George Papadatos
      Juliette Pradon
        Hina Patel
     Nikolas Fechner
      David Thorner
     Michael Bodkin


KNIME, chEMBL + Cresset !
ROC curves for retrieval of >66% isosteric groups

• Field similarity performs better than RDKit
• But AUC = 0.68
• Workflow not optimized for this purpose
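For reference, an AUC such as the one quoted above can be computed directly from similarity scores and isosteric/non-isosteric labels. The sketch below uses the rank interpretation of the AUC (the probability that a randomly chosen positive scores higher than a randomly chosen negative, ties counting half). It is a generic illustration, not the evaluation used for the slides; the function name and the example data are made up.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0   # positive ranked above negative
            elif pos == neg:
                wins += 0.5   # tie counts half
    return wins / (len(positives) * len(negatives))


# Made-up similarity scores; label 1 = isosteric group, 0 = not.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported 0.68 in context. The pairwise loop is quadratic; for large sets a sort-based version would be preferable.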

Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 

David Evans, Eli Lilly, 'Field-Aligned Matched Pairs'

  • 1. David Evans and George Papadatos, Lilly Research Centre, Erl Wood Manor, Windlesham, UK. 22nd September 2011
  • 2. Discover new chemotypes. Multiobjective space: isosteres in activity; improvements in properties. Want to use multiple tools in the same environment, but understand what works when.
  • 3. Open-source workflow tool; main client is free. But support is available, and it can integrate commercial vendors + in-house code as nodes. Have released many Erl Wood nodes to the KNIME community site: http://tech.knime.org/community/erlwood
  • 4. FieldAlign, Xedmin, Xedex. Xedmin: XED minimization; 2D -> 3D. Xedex: conformational analysis. FieldView: launches FieldView; view field points + energies + other data. All nodes pass SDF.
  • 5. FieldAlign: flexible alignment of query molecules onto template.
  • 6. WHY ? Process is more than just the database search. Company Confidential. Copyright © 2008 Eli Lilly and Company
  • 7. Don’t want to load all databases onto all users’ PCs + secure intranet! Command-line search; SOAP Web Service (Apache Tomcat); node. Platform-independent communication.
  • 8. Non-proprietary structure. Read in pre-built hypothesis (MOE, Phase), or sketch from template molecule. Jmol-based visualizer. Can also annotate and filter hits, which aids manual inspection.
  • 9. How well do automated pharmacophore methods do compared to 2D methods? Maximum Unbiased Validation (MUV) dataset: 17 targets, total 30 ligands and 15000 decoys per target; source: PubChem bioactivity data. Wide-ranging targets: hormone receptors, kinases, proteases, GPCRs plus others (e.g. HSP90, HIV RT). Unbiased for chemical analogues, as MUV ligands are pre-clustered with 2D fingerprint: 1.16 compounds per scaffold class. MUV: J. Chem. Inf. Model., 2009, 49 (2), 169-184.
  • 10.
  • 11. Have looked at whole-molecule similarity. Is there more data if we find fragments which maintain activity? Matched molecular pair analysis (MMP): fragments compounds and finds pairs where only one fragment differs.
  • 12. The mining and statistical analysis of transformations and their impact on properties of interest (e.g. solubility or activity). Example table columns: left molecule, right molecule, transformation, ΔSolubility (mg/ml). Transformations shown: H -> F, Br -> OCH3. ΔSolubility values shown: -0.8, +1.2, +2.4.
  • 13. (*in an automated and unsupervised way) It used to be a slow and computationally expensive process: pair-wise maximum common substructure extraction, O(N^2). Recently a much more efficient algorithm was published: 1) Cleave all acyclic single bonds, one by one. 2) Index all the fragments (cf. book index). 3) Enumerate the values for each key: Mol A >> Mol B. References: Hussain and Rea (2010). J. Chem. Inf. Model., 50 (3), 339-348. Wagener and Lommerse (2006). J. Chem. Inf. Model., 46 (2), 677-685.
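The index-and-enumerate steps (2 and 3) on this slide can be sketched in plain Python. This is only an illustration under stated assumptions, not the Erl Wood node or the published implementation: step 1 (bond cleavage) is assumed to have been done already by a cheminformatics toolkit, the `toy_cuts` data and the function names `index_fragments` and `enumerate_pairs` are hypothetical, and no fragment-size limit is applied.

```python
from collections import defaultdict
from itertools import combinations


def index_fragments(cuts):
    """Step 2: build a context -> [(mol_id, fragment)] index, like a book index."""
    index = defaultdict(list)
    for mol_id, pairs in cuts.items():
        for context, fragment in pairs:
            index[context].append((mol_id, fragment))
    return index


def enumerate_pairs(index):
    """Step 3: entries under one key share a context; differing fragments form a pair."""
    pairs = []
    for context, entries in index.items():
        for (id_a, frag_a), (id_b, frag_b) in combinations(entries, 2):
            if id_a != id_b and frag_a != frag_b:
                pairs.append((id_a, id_b, f"{frag_a}>>{frag_b}", context))
    return pairs


# Hypothetical, hand-written single-cut fragmentations (step 1 assumed done
# elsewhere); '*' marks the attachment point. Each cut is (context, fragment).
toy_cuts = {
    "molA": [("*c1ccccc1", "*F"), ("*F", "*c1ccccc1")],
    "molB": [("*c1ccccc1", "*Cl"), ("*Cl", "*c1ccccc1")],
    "molC": [("*C1CCCCC1", "*F"), ("*F", "*C1CCCCC1")],
}

matched_pairs = enumerate_pairs(index_fragments(toy_cuts))
for pair in matched_pairs:
    print(pair)
```

The point of the index is that molecules are only compared when they already share a context key, so no maximum common substructure search over every pair of molecules is needed.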
  • 14. In: MolRegnos (IDs), structures (in RDKit format) and property values. Out: matched pairs (left and right molecule, IDs, transformation, property values, ΔP, context, transformation atom count). Available as an Erl Wood community contribution node.
  • 15. Find isosteres in ChEMBL. ChEMBL: database of published medicinal chemistry activity data. Using ChEMBL_10, total >1,000,000 compounds. Use here just human protein kinase inhibitors. Quality assurance for ChEMBL data (SQL statement): med. chem. friendly compounds, parent structure, not downgraded, confidence score = 9, exact IC50 or Ki values only (converted to pIC50/pKi) -> ~14K data points. Compare biological values coming from the same assay ID only. Aggregate transformations; calculate and bin ΔpIC50s in 3 bins: Good, Bad, Neutral (depending on a cut-off c = 0.4 log units).
  • 16. • Each transformation has a neutral count • Absolute value or percentage: NeutralCount%
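A minimal sketch of the binning and the NeutralCount% aggregation, assuming that "Good" means a change above +c, "Bad" a change below -c, and everything in between (including exactly ±c) "Neutral". The sign convention, the function names and the toy records are assumptions for illustration; the slides give only the cut-off value.

```python
from collections import Counter, defaultdict


def bin_delta(delta, cutoff=0.4):
    """Assign one property change (in log units) to Good, Bad or Neutral."""
    if delta > cutoff:
        return "Good"
    if delta < -cutoff:
        return "Bad"
    return "Neutral"


def neutral_count_percent(records, cutoff=0.4):
    """Aggregate (transformation, delta) records into a NeutralCount% per transformation."""
    counts = defaultdict(Counter)
    for transformation, delta in records:
        counts[transformation][bin_delta(delta, cutoff)] += 1
    return {
        t: 100.0 * c["Neutral"] / sum(c.values())
        for t, c in counts.items()
    }


# Hypothetical records: (transformation, delta pIC50) from matched pairs.
toy_records = [
    ("*F>>*Cl", 0.1),
    ("*F>>*Cl", -0.2),
    ("*F>>*Cl", 0.9),
    ("*C>>*N", -1.0),
]

neutral_pct = neutral_count_percent(toy_records)
for transformation, pct in neutral_pct.items():
    print(f"{transformation}: {pct:.1f}% neutral")
```

Passing `cutoff=0.3` would give the binning described later for the ΔlogS solubility data.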
  • 17. ChEMBL workflow outputs isosteric fragments. How similar are isosteres in 2D fingerprint space? In field space? Could fields help us find unexpected isosteres?
  • 18. 1802 fragment pairs from ChEMBL_10 kinase data set; 481 with no rotatable bonds left or right (simplifies conformational analysis). For each fragment pair: 1. Swap attachment points for adamantyl. 2. FieldAlign to get field similarity (use adamantyl to constrain overlay). 3. RDKit fingerprint similarity (topological, Daylight-esque). 4. Correct similarities for adamantyl. Are there isosteric pairs with high field similarity but low RDKit similarity?
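The final question on this slide amounts to a filter over two similarity columns. The sketch below assumes the 2D similarity is a Tanimoto coefficient over fingerprint on-bits (the slides do not name the metric), takes the field similarity as a precomputed number, and uses hypothetical thresholds and data. The adamantyl correction in step 4 is not described in the slides and is not modelled here.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two collections of fingerprint on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def unexpected_isosteres(pairs, field_min=0.7, fp_max=0.3):
    """Keep pairs with high field similarity but low 2D fingerprint similarity.

    Both thresholds are hypothetical, chosen only for illustration.
    """
    return [
        name
        for name, field_sim, fp_sim in pairs
        if field_sim >= field_min and fp_sim <= fp_max
    ]


# Hypothetical on-bit sets for two fragments X and Y.
fp_x = {1, 4, 9, 12}
fp_y = {4, 12, 20, 33, 41}

# Hypothetical (pair name, field similarity, fingerprint similarity) rows.
toy_pairs = [
    ("X->Y", 0.82, tanimoto(fp_x, fp_y)),
    ("X->Z", 0.85, 0.75),
    ("X->W", 0.40, 0.10),
]

hits = unexpected_isosteres(toy_pairs)
print(hits)
```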
  • 19. Plot of Field Sim against RDKit Sim; size by Neutral Count % (larger = more isosteric). Highlighted: pairs with high field similarity but low 2D similarity; pairs with high field and 2D similarity.
  • 20. Plot of Field Sim against RDKit Sim; size by Neutral Count %. Only those with >60% isosteric examples. Thiophene -> Phenol
  • 21. Plot of Field Sim against RDKit Sim; size by Neutral Count %. Only those with >60% isosteric examples. Imidazole -> Morpholine?
  • 22. Plot of Field Sim against RDKit Sim; size by Neutral Count %. Only those with >60% isosteric examples. Some small fragments
  • 23. Non-proprietary structure (from PDB): WEE1 kinase, PDB 2I06. Labels: buried; solvent-exposed.
  • 24. Plot of Field Sim against RDKit Sim; size by Neutral Count %. Only those with >60% isosteric examples. Me-tetrazole -> oxadiazole
  • 25. Plot of Field Sim against RDKit Sim; size by Neutral Count %. Only those with >60% isosteric examples. Thiophene -> phenol
  • 27. 6299 data points from thermodynamic solubility assay; 423 single-point transformations; 215 no-rotatable point transformations. Aggregate transformations; calculate and bin ΔlogS in 3 bins: Good, Bad, Neutral (c = 0.3 log units). Are there transformations which increase solubility with low field similarity but high RDKit similarity?
  • 28. Plot of Field Sim against RDKit Sim; size by Good Count %. Only those with >60% boosting examples. Ring contraction + twist?
  • 29. Plot of Field Sim against RDKit Sim; size by Good Count %. Only those with >60% boosting examples. Big boost from morpholine
  • 30. Can mine ChEMBL data for non-obvious isosteres. Will other data sets find more? Would like to improve workflow to make isostere data set for 3D similarity comparison: improve fragmentation/conformer/alignment handling? Need to include whole molecule? Need 3D binding site data as well to confirm isosterism? KNIME platform developing: virtual screening and evaluation environment; rapid experimentation with varied tools. http://tech.knime.org/community/erlwood
  • 31. George Papadatos, Juliette Pradon, Hina Patel, Nikolas Fechner, David Thorner, Michael Bodkin. KNIME, ChEMBL + Cresset!
  • 32. ROC curves for retrieval of >66% isosteric groups. Field similarity performs better than RDKit, but AUC = 0.68. Workflow not optimized for this purpose.
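For readers unfamiliar with the AUC figure quoted on this slide, the area under a ROC curve can be computed as the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as half. The sketch below uses hypothetical similarity scores and labels (1 = isosteric group, 0 = not); it is not the evaluation code behind the slide.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of positive/negative score pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical similarity scores with isosteric (1) / non-isosteric (0) labels.
toy_scores = [0.9, 0.8, 0.6, 0.55, 0.4, 0.3]
toy_labels = [1, 0, 1, 0, 1, 0]

auc = roc_auc(toy_scores, toy_labels)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.68 should be read.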