Do Targets Segregate?
Andrea Zaliani
Aim
A. Zaliani 9thICCS 2
• Bioinformaticians have been able to segregate protein targets by several means, from 1D to 3D and 4D
• We have powerful means to perform the same analysis from the ligand standpoint:
o Fingerprints (e.g. 2D, 3D, interaction FP, etc.)
o Shape descriptors
o Grid
• Do we appreciate their peculiarities?
• Would our structural knowledge grow if we knew some frequent target-directing structural patterns?
Start – Method
• Plenty of recent work tries to link protein structures, functions and cavities to ligands (and vice versa) through similarity concepts
• Here I stress not new methods but what we already have in our hands to boost ideas, with a couple of applications built on freely available software (such as KNIME and R)
• FP: do we appreciate their peculiarities enough?
• Can we look into statistical models? If yes, do we?
Different fingerprints (FP) for different scopes
• Can FP explain this to us?
FP Type                Tan-Distance
MW, LogP, HA (CDK)…    0.000
Layered (RDKit)        0.082
AtomPairs (RDKit)      0.098
Indigo (GGA)           0.190
Morgan (RDKit)         0.302
FeatMorgan (RDKit)     0.348
ErG*                   0.375
Similarity ≈ 0.62-0.65
*N. Stiefl et al., JCIM (2006), 46(1), 208; N. Stiefl et al., JCIM (2006), 46(2), 587
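The Tan-Distance column is one minus the Tanimoto similarity, so a distance of 0.375 corresponds to a similarity of 0.625. A minimal sketch of that relation in plain Python follows, assuming fingerprints are represented as sets of "on" bit indices. The function name and the two toy fingerprints are illustrative assumptions, not data from the slides.

```python
def tanimoto_distance(bits_a, bits_b):
    """Tanimoto distance between two fingerprints given as sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        # Two empty fingerprints are treated as identical (assumption).
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Toy fingerprints (hypothetical bit indices, not taken from the slides).
fp_a = {1, 4, 7, 9, 12, 15, 20, 31}
fp_b = {1, 4, 7, 9, 12, 18, 22, 40}

distance = tanimoto_distance(fp_a, fp_b)
similarity = 1.0 - distance
print(f"distance={distance:.3f} similarity={similarity:.3f}")
```

The same pair of molecules gives a different distance under each fingerprint type because each type switches on a different set of bits.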
ErG = pharmacophore-fingerprint
Development of ErG (Extended reduced Graph), a 2D-pharmacophoric similarity tool for virtual screening
ErG is much less substructure-dependent, so that:
• It opens opportunities in library design (scaffold-hopping)
• There is a multiple-to-one correspondence of chemical substructures to pharmacophoric patterns (‘abstract’)
• Similarity searching & ‘scaffold-hopping’ are documented
• The FP is interpretable, as each bit corresponds to the count of a pharmacophore pair at a given graph distance
• Atom types [6] generate pairs [21] × max_distance [15] = 315 bits
[Figure: ErG reduced graph of an example molecule. Node labels: Ac, D, +, Hf, Ar. Panel titles: Charge / H-Bonding; Hydrophobic endcaps; Abstract ring forms]
RDF vectorization
AcAcd1, AcAcd2, …, AcDod4, …, ArHfd4, …, +-d15
Cpd_A, 0, 0, …, 1, …, 1, …, 0
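The bit count quoted for ErG (6 atom types, 21 pairs, 15 distances, 315 bits) can be reproduced by enumerating unordered type pairs, including a type paired with itself. The sketch below is an assumption-laden illustration: the type abbreviations follow the labels seen in the slides, but their order, and therefore the order of the generated labels, is chosen here for convenience and is not necessarily the order any ErG implementation uses.

```python
from itertools import combinations_with_replacement

# Six pharmacophoric node types; the ordering is an assumption for illustration.
TYPES = ["Ac", "Do", "+", "-", "Hf", "Ar"]
MAX_DISTANCE = 15


def erg_bit_labels(types=TYPES, max_distance=MAX_DISTANCE):
    """Return one label per (type pair, graph distance) combination."""
    pairs = list(combinations_with_replacement(types, 2))
    return [f"{a}{b}d{d}" for a, b in pairs for d in range(1, max_distance + 1)]


labels = erg_bit_labels()
n_pairs = len(labels) // MAX_DISTANCE
print(n_pairs, len(labels))
print(labels[:3])
```

A compound is then a row of counts over these labels, as in the Cpd_A line above.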
Experiment plan - Dataset
• From a literature database, select a relevant random subset (ca. 17K) of literature compounds showing at least one activity (pEx50 > 6) towards a specific target in class families such as GPCR-A, Kinases, Proteases or NHR
• Data are high quality in terms of consistency
• Less than 5% of the entire Pharma Database of Evolvus
• To check homogeneity: all vs. all similarity evaluation with TanDistance under different FP…
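An all vs. all evaluation can be sketched as a loop over every unordered pair of compounds. Because ErG bits hold counts rather than on/off flags, the sketch below uses the min/max generalisation of Tanimoto for count vectors. That choice, the helper names and the three toy compounds are assumptions made for illustration, not the workflow used for the 17K set.

```python
from itertools import combinations


def count_tanimoto_distance(u, v):
    """1 - sum(min)/sum(max): Tanimoto distance generalised to count vectors."""
    numerator = sum(min(a, b) for a, b in zip(u, v))
    denominator = sum(max(a, b) for a, b in zip(u, v))
    return 0.0 if denominator == 0 else 1.0 - numerator / denominator


def all_vs_all(vectors):
    """Distance for every unordered pair of named count vectors."""
    names = sorted(vectors)
    return {
        (p, q): count_tanimoto_distance(vectors[p], vectors[q])
        for p, q in combinations(names, 2)
    }


# Hypothetical count vectors over four descriptor bits.
toy = {
    "cpd_1": [2, 0, 1, 0],
    "cpd_2": [2, 0, 0, 1],
    "cpd_3": [0, 3, 0, 1],
}

distances = all_vs_all(toy)
mean_distance = sum(distances.values()) / len(distances)
print(distances, round(mean_distance, 3))
```

Comparing the mean distance within a target family against the mean across families is one simple way to read homogeneity from such a matrix.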
Liceptor Database
Targets Annotated
• GPCRs
• Ion Channels
• CNS Transporters
• Kinases
• Proteases
• Phosphatases
Client Proprietary Targets
Small Molecule Ligand Database Features
Liceptor database can be customized with client specified additional fields and
custom data annotation
• 3.2 Million Structures
• > 1000 Targets
• Global Patents
• Med Chem. Journals
• Data annotated from 1967
• Multiple Target Data
• 2D Structures
• Molecular Descriptors
• IC50 and Unified Values
• Therapeutic Indications
Pharmacophore-based FP better
[Figure panels: RDKit FP; RDKit Feature Morgan FP]
Experiment plan - Dataset
Experiment plan – Classification Model
• Partition Tree model generated
• Platform (KIN, GPCRA, NHR, PROT) can be predicted with only 15 ErG distances
• With Y shuffled (100×), the models generated have average errors ranging from 63% to 77%
• External predictions at 82.6%
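The Y-shuffling control can be sketched in plain Python: train on the real labels, then retrain 100 times on permuted labels and compare the error rates on the same test set. This is only a sketch under stated assumptions. A nearest-centroid classifier stands in for the partition tree, and the synthetic four-class data, class separation and helper names are invented for illustration.

```python
import random


def fit_centroids(X, y):
    """Nearest-centroid 'model': the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, value in enumerate(row):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(centroids, row):
    """Label of the closest centroid (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda lab: sum((c - v) ** 2 for c, v in zip(centroids[lab], row)),
    )


def error_rate(centroids, X, y):
    wrong = sum(predict(centroids, row) != label for row, label in zip(X, y))
    return wrong / len(y)


def make_toy_data(rng, n_per_class=50):
    """Synthetic data: each class is shifted along its own feature axis."""
    X, y = [], []
    for class_idx, label in enumerate(["KIN", "GPCRA", "NHR", "PROT"]):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 0.5) for _ in range(4)]
            row[class_idx] += 3.0
            X.append(row)
            y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_toy_data(rng)
X_test, y_test = make_toy_data(rng)

real_error = error_rate(fit_centroids(X_train, y_train), X_test, y_test)

shuffled_errors = []
for _ in range(100):
    y_shuffled = y_train[:]
    rng.shuffle(y_shuffled)  # break the link between features and labels
    model = fit_centroids(X_train, y_shuffled)
    shuffled_errors.append(error_rate(model, X_test, y_test))

mean_shuffled_error = sum(shuffled_errors) / len(shuffled_errors)
print(real_error, mean_shuffled_error)
```

With four balanced classes, a model trained on shuffled labels is expected to sit near the 75% chance error. A real model far below that level is learning signal rather than noise.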
Target Family Classification Model
Learn from misclassified
• 15 distances are enough to segregate 17K compounds into four classes
• Some insights can be extracted from the model:
• Example KIN relevant features:
i. Presence of Ar-NH(OH) [DoArd1 > 0]
ii. Absence of α-amino acid signature [AcDod3 = 0]
iii. Need of AcArd3 > 0 if i. applies or = 1
6H-Benzo[c]chromen-6-one derivatives as selective ERβ agonists
Bioorganic & Medicinal Chemistry Letters 16, (6), 2006, Pages 1468-1472
Classification Model – What to learn
• 15 distances are enough to segregate 17K compounds into four classes
• Some insights can be extracted from the model:
• PROTEASE target relevant features:
i. Presence of AA signature AcDod3
ii. Presence of AcArd3
iii. Absence, or presence of at most one, HfArd4
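Read as thresholds on ErG counts, the three protease features amount to a simple predicate. The sketch below assumes each compound is a dictionary mapping ErG bit labels to counts. The function name, that representation and the two toy compounds are assumptions. It is a reading of the bullet points, not the partition tree itself.

```python
def matches_protease_pattern(erg_counts):
    """Check the three protease-relevant features listed above."""
    has_aa_signature = erg_counts.get("AcDod3", 0) > 0   # i. AA signature present
    has_ac_ar_d3 = erg_counts.get("AcArd3", 0) > 0       # ii. AcArd3 present
    at_most_one_hf_ar = erg_counts.get("HfArd4", 0) <= 1  # iii. zero or one HfArd4
    return has_aa_signature and has_ac_ar_d3 and at_most_one_hf_ar


# Hypothetical compounds described by a few ErG counts.
peptide_like = {"AcDod3": 2, "AcArd3": 1, "HfArd4": 0}
kinase_like = {"DoArd1": 1, "AcDod3": 0, "AcArd3": 1}

print(matches_protease_pattern(peptide_like))
print(matches_protease_pattern(kinase_like))
```

Writing the rules this way makes it easy to test each condition separately against misclassified compounds.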
Classification Model – How do we use this
• We can try to use these as SMARTS queries against the PDB
http://www.pdb.org/pdb/search/advSearch.do
• PROTEASE target relevant features:
i. Presence of AA signature AcDod3
ii. Presence of AcArd3
iii. Presence of at most one HfArd4
• Results of the query after removal of non-polypeptides, solvents and chain duplicates:
• 101 complexes, of which 53% are correctly proteases
• If only i. and iii. were used, 1141 hits were found, with 738 protease complexes (65%) retrieved
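The percentage quoted for the relaxed query is the fraction of retrieved complexes that are proteases. A quick check of that arithmetic follows. The function name is an illustrative assumption.

```python
def hit_precision(correct_hits, total_hits):
    """Fraction of retrieved complexes that belong to the expected family."""
    return correct_hits / total_hits


# Relaxed query (features i. and iii. only): 738 proteases among 1141 hits.
relaxed_precision = hit_precision(738, 1141)
print(f"{relaxed_precision:.1%}")
```

Dropping feature ii. therefore retrieves roughly eleven times as many complexes while also raising the share of proteases from 53% to about 65%.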
Single Family Classification Models
• Each target family could also be modeled through classification
• KNIME offers several functions for:
o Data preparation
o Training/test split with stratification on population
o Data reduction performed with an exhaustive backward (retrograde) selection
o Cross-validation with 100× leave-10%-out
o 100 shuffled-Y classification models built as a negative test
o Performance statistics given on a 25% external test set
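Outside KNIME, the splitting steps in this list can be sketched with the Python standard library: a stratified 75/25 split that keeps class proportions, followed by 100 repeats of leaving a random 10% of the training set out. The function names, seeds and toy label counts are assumptions for illustration. This does not reproduce the KNIME nodes themselves.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Split indices so each class keeps the same train/test proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)


def leave_fraction_out(indices, fraction=0.10, repeats=100, seed=0):
    """Yield (kept, held_out) index lists for repeated random hold-outs."""
    rng = random.Random(seed)
    n_out = max(1, round(len(indices) * fraction))
    for _ in range(repeats):
        held_out = set(rng.sample(indices, n_out))
        yield [i for i in indices if i not in held_out], sorted(held_out)


# Toy, deliberately unbalanced label list (hypothetical counts).
labels = ["KIN"] * 40 + ["GPCRA"] * 40 + ["NHR"] * 20 + ["PROT"] * 20
train_idx, test_idx = stratified_split(labels)
folds = list(leave_fraction_out(train_idx))
print(len(train_idx), len(test_idx), len(folds))
```

The external 25% is set aside once and never touched during the repeated leave-10%-out rounds, so the final performance statistics stay independent of model selection.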
Classification Model – NHR
[Figures: NHR classification model; Ave. Distance Profiles, axis from 0.00 to 1.00]
Classification Model – Kinase
[Figures: Kinase classification model; Ave. Distance Profiles]
Classification Model – GPCRA
[Figures: GPCRA classification model; Ave. Distance Profiles]
Lessons learned here
• A QC-based database is essential
• The 2D pharmacophoric FP approach is enough, but has to be “understood”
• Making FP less cryptic helps in understanding their potential and limits
• Targets do segregate. Ligands help us realize this: the more precise, the more so
• Pharmacophoric graph space is immensely less problematic than chemical space
• Provocation: how big is the graph space of IP?
Limitations
• Question: do you find what you already know?
• Question: does abstraction help us?
• Every FP method is OK, provided that it teaches us something
• Promiscuity reduction is not the only final aim (controlled promiscuity might be needed)
• Graph distances might be too general
• 2D pharmacophoric fingerprinting is to be improved
Future work
• 3D distances (3D triangles) could easily be implemented
• Combinations of ligand FP and cavity FP could really be a breakthrough for getting a grip on multi-pharmacology
• FP weighting for the atomic de-solvation contribution is, for me, KEY
• Agonist/antagonist split
• Will pEx50 > 6 provide different pictures?
Acknowledgements
Prof. M. Berthold
Greg Landrum
Nik Stiefl
Aniket Ausekar, CEO
Vikram Palshikar
Rashmi Jain
Mike Bodkin
Appendix
Approach to Polypharmacology
• Pharmacophore target family mapping using neural networks (Kohonen)
• Compounds mapped together with annotated actives from different sources (MDDR, UBI, etc.)
• Clustering method to suggest pharmacophore similarity (Extended reduced Graph fingerprint)
SOM on binary ErG for 9444 compounds with pIC50 > 8
[Figure: 8 × 8 SOM map (pIC50_8_SOM8_8_1M_Z, x value), both axes 0 to 7. Legend: Protease, GPCRA, Kinases, NHR, Transporter. Neuron 7,3: 775 cpds from different families. Example structures: 2425712, pIC50(PR) = 8.79; 450207, pIC50(NPY_V) = 8.79]
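A Kohonen map of this kind can be sketched in a few lines of plain Python. The 8 × 8 grid matches the map above, but everything else is an assumption made for illustration: the learning-rate and radius schedules, the Gaussian neighbourhood, the number of epochs and the toy binary vectors. It is not the configuration used for the 9444 compounds.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Grid coordinates of the neuron whose weights are closest to the vector."""
    return min(
        weights,
        key=lambda node: sum((w - x) ** 2 for w, x in zip(weights[node], vector)),
    )


def train_som(vectors, rows=8, cols=8, epochs=20, seed=0):
    """Train a small self-organising map; returns {(row, col): weight list}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    total_steps = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = list(range(len(vectors)))
        rng.shuffle(order)
        for i in order:
            progress = step / total_steps
            learning_rate = 0.5 * (1.0 - progress)  # linear decay (assumption)
            radius = max(1.0, (max(rows, cols) / 2.0) * (1.0 - progress))
            bmu = best_matching_unit(weights, vectors[i])
            for (r, c), w in weights.items():
                grid_d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                influence = math.exp(-grid_d2 / (2.0 * radius ** 2))
                for k in range(dim):
                    w[k] += learning_rate * influence * (vectors[i][k] - w[k])
            step += 1
    return weights


# Toy binary fingerprints standing in for binarised ErG vectors.
toy_vectors = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
som = train_som(toy_vectors)
placements = [best_matching_unit(som, v) for v in toy_vectors]
print(placements)
```

Compounds from different families that land on the same neuron, as with neuron 7,3 above, share a pharmacophoric pattern despite different annotated targets.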
