3D virtual screening of PknB inhibitors using data


Published in: Education

  1. Abhik Seal, PhD Student (Chemical Informatics), Indiana University Bloomington. http://chemin-abs.blogspot.com/ mypage.iu.edu/~abseal/ 10/16/2012 abseal@indiana.edu
  2. What is PknB?
     • A Ser/Thr protein kinase (STPK), highly conserved in Gram-positive bacteria and apparently essential for mycobacterial viability.
     • Essential for cell division and metabolism. It is expressed during exponential growth, and its overexpression causes defects in cell wall synthesis and cell division.
  3. PknB ATP-binding pocket (figure; label: Gatekeeper). Wehenkel, FEBS Letters 580 (2006) 3018–3022.
  4. Kinase inhibitors and pharmacophores
     • Targeting cancer with small molecule kinase inhibitors. Nature Reviews Cancer, 2009.
     • Through the “Gatekeeper Door”: Exploiting the Active Kinase Conformation. J. Med. Chem. 2010, 53, 2681–2694.
  5. Properties of kinase inhibitors. Through the “Gatekeeper Door”: Exploiting the Active Kinase Conformation. J. Med. Chem. 2010, 53, 2681–2694.
  6. Some PknB inhibitors
  8. • A data fusion algorithm accepts two or more ranked lists and merges them into a single ranked list, with the aim of being more effective than any of the individual systems being fused (Croft, 2000, Chapter 1; Meng et al., 2002).
     • Another aim of data fusion is to group existing search services under one umbrella as their number increases (Selberg & Etzioni, 1996).
     • Fusion in automatic ranking of IR systems: Automatic ranking of information retrieval systems using data fusion, Nuray & Can, 2006.
     • Merging the retrieval results of multiple systems. See more on Wikipedia (http://en.wikipedia.org/wiki/Data_fusion).
  9. Used by metasearch engines (http://en.wikipedia.org/wiki/List_of_search_engines#Metasearch_engines), for example www.dogpile.com, www.copernic.com, www.hotbot.com.
     (Diagram labels: Meta search; Engine 1, Engine 2, Engine 2; D1, D2, D3; Information Resource.)
  10. Workflow of meta-search
      • Execute a database search for a particular target structure using several different similarity measures.
      • Note the rank position, R(i), of each database structure in the ranking for the i-th similarity measure, using similarity coefficients.
      • Combine the various positions using a fusion rule to give a new rank position for each database structure.
      • Use these fused positions to generate the final output ranking for the search.
      http://www.his.se/PageFiles/6884/Peter%20Willet%20presentation.pdf
  11. Types of fusion for 2D similarity search
      a) Similarity fusion (SF): a single reference structure is searched against a database using multiple different similarity measures, and the output is obtained by combining the rankings resulting from these measures.
      b) Group fusion (GF): multiple reference structures are searched against a database using a single similarity measure, and the output is obtained by combining the rankings resulting from these reference structures.
      Holliday et al.: Multiple search methods for similarity-based virtual screening: analysis of search overlap and precision. Journal of Cheminformatics 2011, 3:29.
  12. Similarity fusion (SF). Figure panels: (a) WOMBAT top-1% searches; (b) WOMBAT top-5% searches; (a) MDDR top-1% searches; (b) MDDR top-5% searches. Holliday et al.: Multiple search methods for similarity-based virtual screening: analysis of search overlap and precision. Journal of Cheminformatics 2011, 3:29.
  13. Group fusion (GF). Figure panels: (a) WOMBAT top-1% searches; (b) WOMBAT top-5% searches; (a) MDDR top-1% searches; (b) MDDR top-5% searches.
  14. Reciprocal rank method
      • Merge compounds using only their rank positions.
      • Rank score of compound i (j: system index), where pos(d_ij) is the position of compound i in the ranking of system j:
        $$ r(d_i) = \frac{1}{\sum_j \frac{1}{\mathrm{pos}(d_{ij})}} $$
  15. Reciprocal rank example
      • 4 systems: A, B, C, D; documents: a, b, c, d, e, f, g
      • Query results: A={a,b,c,d}, B={a,d,b,e}, C={c,a,f,e}, D={b,g,e,f}
      • r(a) = 1/(1 + 1 + 1/2) = 0.4; r(b) = 1/(1/2 + 1/3 + 1) ≈ 0.55. A lower r means a better fused rank.
      • Final ranking of compounds: (most relevant) a > b > c > e > d > f > g (least relevant), since r(e) = 1.2 is lower than r(d) ≈ 1.33.
      Nuray, R.; Can, F. Automatic ranking of information retrieval systems using data fusion. Information Processing and Management 42 (2006) 595–614.
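The reciprocal rank rule from this example can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `reciprocal_rank_fusion` and the alphabetical tie-break are assumptions, and a document missing from a system's list is simply skipped for that system.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings):
    """Return {item: r(item)} with r = 1 / sum_j(1 / pos_j); lower r is better.

    An item absent from a ranking contributes nothing for that system
    (an assumption of this sketch).
    """
    inverse_sum = defaultdict(float)
    for ranked_list in rankings:
        for position, item in enumerate(ranked_list, start=1):
            inverse_sum[item] += 1.0 / position
    return {item: 1.0 / total for item, total in inverse_sum.items()}


# The four systems from the example slide.
systems = {
    "A": ["a", "b", "c", "d"],
    "B": ["a", "d", "b", "e"],
    "C": ["c", "a", "f", "e"],
    "D": ["b", "g", "e", "f"],
}

scores = reciprocal_rank_fusion(systems.values())
# Sort by ascending r; ties (none occur here) are broken alphabetically.
fused_ranking = sorted(scores, key=lambda item: (scores[item], item))

for item in fused_ranking:
    print(f"{item}: r = {scores[item]:.3f}")
```

Because only positions enter the formula, the same function can fuse rankings from methods whose raw scores are on different scales, such as pharmacophore fitness, shape similarity and docking score.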
  16. Sum score
      The normalized scores of each ranking are summed to get the fused score of a compound.

      | Compound   | Ranking 1 | Ranking 2 | Ranking 3 | Sum score | Rank |
      | Compound 1 | 1         | 0.9       | 0.7       | 2.6       | 1    |
      | Compound 2 | 0.8       | 0.5       | 1         | 2.3       | 2    |
      | Compound 3 | 0.7       | 1         | 0.5       | 2.2       | 3    |
      | Compound 4 | 0.2       | 0         | 0.1       | 0.3       | 4    |
      | Compound 5 | 0         | 0.3       | 0         | 0.3       | 5    |
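A minimal Python sketch of sum-score fusion, using the normalized values from the table on this slide. The slide does not say how the scores were normalized, so the `min_max_normalize` helper is an assumption shown only for completeness; the function names and the tie handling (ties keep their input order) are also illustrative.

```python
def min_max_normalize(raw_scores):
    """Scale raw scores to [0, 1]; min-max scaling is an assumption here."""
    low, high = min(raw_scores), max(raw_scores)
    if high == low:
        return [0.0 for _ in raw_scores]
    return [(score - low) / (high - low) for score in raw_scores]


def sum_score_fusion(normalized):
    """normalized: {compound: [score in ranking 1, ranking 2, ...]}.

    Returns (compound, summed score) pairs, best first. Sums are rounded
    to suppress floating-point noise; ties keep their input order.
    """
    totals = {name: round(sum(values), 6) for name, values in normalized.items()}
    return sorted(totals.items(), key=lambda pair: -pair[1])


# Normalized scores taken from the table on the slide.
table = {
    "Compound 1": [1.0, 0.9, 0.7],
    "Compound 2": [0.8, 0.5, 1.0],
    "Compound 3": [0.7, 1.0, 0.5],
    "Compound 4": [0.2, 0.0, 0.1],
    "Compound 5": [0.0, 0.3, 0.0],
}

fused = sum_score_fusion(table)
for rank, (name, total) in enumerate(fused, start=1):
    print(rank, name, total)
```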
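  17. Sum rank
      • In sum rank, the scores in each ranking are first converted to rank positions, with the highest score receiving the lowest rank number. The rank positions are then summed across rankings, and the compounds are re-ranked by this sum (smallest sum first).

      | Compound   | Ranking 1 | Ranking 2 | Ranking 3 | Sum rank | Rank |
      | Compound 1 | 1         | 10        | 4         | 15       | 5    |
      | Compound 2 | 2         | 5         | 6         | 13       | 3    |
      | Compound 3 | 7         | 4         | 3         | 14       | 4    |
      | Compound 4 | 2         | 3         | 3         | 8        | 2    |
      | Compound 5 | 3         | 2         | 1         | 6        | 1    |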
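A minimal Python sketch of the sum-rank rule, applied to the rank positions listed on this slide. The function name is illustrative, and giving compounds with equal sums the same final rank is an assumption of the sketch.

```python
def sum_rank_fusion(rank_positions):
    """rank_positions: {compound: [rank in ranking 1, ranking 2, ...]}.

    Returns {compound: (rank sum, final rank)}; the smallest sum gets rank 1.
    Compounds with equal sums share a rank (an assumption of this sketch).
    """
    sums = {name: sum(ranks) for name, ranks in rank_positions.items()}
    ordered = sorted(sums.values())
    return {name: (total, ordered.index(total) + 1) for name, total in sums.items()}


# Rank positions taken from the table on the slide.
positions = {
    "Compound 1": [1, 10, 4],
    "Compound 2": [2, 5, 6],
    "Compound 3": [7, 4, 3],
    "Compound 4": [2, 3, 3],
    "Compound 5": [3, 2, 1],
}

result = sum_rank_fusion(positions)
for name, (rank_sum, final_rank) in sorted(result.items(), key=lambda kv: kv[1][1]):
    print(final_rank, name, rank_sum)
```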
  18. Pharmacophore design
      To generate the pharmacophoric features we used the energetic pharmacophore (e-pharmacophore) developed by Salam et al., with exclusion spheres present. Pharmacophoric sites were generated automatically with Phase using the default set of six chemical features: hydrogen bond acceptor (A), hydrogen bond donor (D), hydrophobic (H), negative ionizable (N), positive ionizable (P), and aromatic ring (R).
  19. E-pharmacophores (figure panels: E-pharmacophore I, E-pharmacophore II, E-pharmacophore III)
  20. Validation of pharmacophores
      • To assess the quality of the hit list returned for a query compound or a pharmacophore, we considered the yield of active compounds, enrichment factor, percentage of actives and Goodness of Hit list (GH score).
      • To assess how well a pharmacophore or any other screening method ranks active compounds “early” in a virtual screening process, we used the Boltzmann-enhanced discrimination of receiver operating characteristic (BEDROC; Truchon et al.) and the RIE metric (Sheridan et al.).
      • Validation set: 35 active compounds randomly sampled from 62 actives, along with 1000 decoys (www.schrodinger.com/glide_decoy_set).
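As an illustration of the enrichment factor, the sketch below uses the common definition (hit rate in the top x% of the ranked list divided by the hit rate in the whole set). The function name and the way the top x% is truncated to a whole number of compounds are assumptions; the slides do not state which convention was used.

```python
def enrichment_factor(is_active_ranked, fraction):
    """EF at a given fraction of the ranked list.

    is_active_ranked: booleans ordered from best-scored to worst-scored.
    EF = (actives in top fraction / compounds in top fraction)
         / (total actives / total compounds)
    """
    n_total = len(is_active_ranked)
    n_actives = sum(is_active_ranked)
    # Truncating to a whole number of compounds is an assumption of this sketch.
    n_top = max(1, int(fraction * n_total))
    hits_in_top = sum(is_active_ranked[:n_top])
    return (hits_in_top / n_top) / (n_actives / n_total)


# Validation set size from the slide: 35 actives and 1000 decoys.
ideal_ranking = [True] * 35 + [False] * 1000
max_ef_1pct = enrichment_factor(ideal_ranking, 0.01)
print(round(max_ef_1pct, 2))  # 29.57
```

With 35 actives among 1035 compounds, a perfect ranking gives EF(1%) = 1035/35 ≈ 29.57, which is the ceiling for this validation set.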
  21. Some formulas
  22. Why BEDROC?
      • Despite its sensitivity to early recognition, the enrichment factor is insensitive to the relative ranking of the compounds within the top X% and ignores the ranking of the rest of the data set.
      • The ROC measure does not single out the compounds ranked early in a virtual screening process.
      • The BEDROC metric uses an exponential decay function to reduce the influence of lower-ranked compounds on the final score. Its parameter α lets the user adjust the definition of the early recognition problem.
      • BEDROC values for the three VS methods were computed at α=20, which means that 80% of the final BEDROC score is based on the first 8% of the ranked data set.
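The following Python sketch shows one way to compute RIE and BEDROC, following the published definitions in which BEDROC is RIE rescaled between its minimum and maximum values. It is an illustration only, not the code used for the results in this deck; the function names are hypothetical, and `weight_in_top` is a helper added here to check the "80% from the first 8%" statement for α=20.

```python
import math


def rie(active_ranks, n_total, alpha=20.0):
    """Robust initial enhancement (RIE) for the 1-based ranks of the actives."""
    n_actives = len(active_ranks)
    observed = sum(math.exp(-alpha * rank / n_total) for rank in active_ranks) / n_actives
    # Mean of the same weight over all N ranks, i.e. its value for a randomly placed active.
    expected = (1.0 / n_total) * (1.0 - math.exp(-alpha)) / (math.exp(alpha / n_total) - 1.0)
    return observed / expected


def bedroc(active_ranks, n_total, alpha=20.0):
    """BEDROC as RIE rescaled to [0, 1] between its minimum and maximum."""
    ratio = len(active_ranks) / n_total
    rie_max = (1.0 - math.exp(-alpha * ratio)) / (ratio * (1.0 - math.exp(-alpha)))
    rie_min = (1.0 - math.exp(alpha * ratio)) / (ratio * (1.0 - math.exp(alpha)))
    return (rie(active_ranks, n_total, alpha) - rie_min) / (rie_max - rie_min)


def weight_in_top(fraction, alpha=20.0):
    """Share of the exponential weight that falls in the top `fraction` of the list."""
    return (1.0 - math.exp(-alpha * fraction)) / (1.0 - math.exp(-alpha))


n_total = 1035  # 35 actives + 1000 decoys, as in the validation set
best = bedroc(list(range(1, 36)), n_total)                       # actives ranked first
worst = bedroc(list(range(n_total - 34, n_total + 1)), n_total)  # actives ranked last
print(best, worst, round(weight_in_top(0.08), 3))
```

A perfect ranking scores 1 and the worst possible ranking scores 0 (up to floating-point error), and at α=20 about 80% of the weight sits in the first 8% of the list.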
  23. Validation of virtual screening
      a) E-pharmacophore: E-pharmacophore III was selected based on the performance measures: the number of compounds retrieved with a fitness above 2, and its high Goodness of Hit score, yield of actives and specificity.
      b) ROCS: All compounds were scored and ranked by TanimotoCombo score; parameters were selected as described by Bostrom et al.
      c) Glide XP: All compounds were scored by Glide XP docking score and ranked in descending order of score.
  24. Which pharmacophore is good? Are sites D8 and R13 important? (Figure panels: E-pharmacophore I, E-pharmacophore II, E-pharmacophore III; labelled sites: R13, D8.)
  25. Results
  26. Performance measures

      | Method              | EF(1%) | EF(2%) | EF(5%) | EF(10%) | BEDROC (α=20) | RIE   |
      | E-pharmacophore I   | 11.71  | 11     | 10.51  | 6.8     | 0.538         | 7.81  |
      | E-pharmacophore II  | 29.57  | 27.51  | 12.14  | 6.9     | 0.716         | 10.40 |
      | E-pharmacophore III | 29.57  | 27.14  | 13.71  | 7.42    | 0.744         | 10.81 |
      | vROCS               | 29.57  | 26.71  | 13.14  | 7.42    | 0.749         | 10.89 |
      | GlideXP             | 26.71  | 21     | 11.42  | 6.28    | 0.629         | 9.14  |
      | Sum score           | 29.57  | 28.57  | 14.85  | 7.42    | 0.785         | 11.42 |
      | Sum rank            | 29.57  | 24.28  | 12     | 7.42    | 0.703         | 10.21 |
      | Reciprocal rank     | 29.57  | 29.57  | 17.14  | 8.85    | 0.875         | 12.73 |
  27. AUC ROC results

      | Method              | AUC(1%) | AUC(2%) | AUC(5%) | AUC(100%) |
      | E-pharmacophore III | 0.56    | 0.602   | 0.649   | 0.832     |
      | vROCS               | 0.58    | 0.62    | 0.62    | 0.89      |
      | GlideXP             | 0.39    | 0.44    | 0.51    | 0.84      |
      | Sum score           | 0.64    | 0.6780  | 0.717   | 0.90      |
      | Sum rank            | 0.47    | 0.49    | 0.565   | 0.91      |
      | Reciprocal rank     | 0.72    | 0.75    | 0.81    | 0.96      |
  28. Architecture (diagram labels: System 1, System 2, System 3, System 4; Data Preprocessing; Rescoring and Ranking; Fusion Algorithms; Validation; Decision)
  29. Virtual screening of Asinex 400K compounds: workflow
      • Chemical structure collection: 400K compounds from Asinex, optimized using LigPrep.
      • 3D virtual screening: Phase E-pharmacophore screening to select the top 5000 compounds for VS in vROCS and Glide SP; conformer generation and ROCS; Glide SP docking.
      • Virtual screening using data fusion: data fusion using the reciprocal rank algorithm.
      • Post-processing and ranking: top 10% of the database selected for Glide XP docking.
      • Compound selection: 45 compounds selected after visual inspection and pharmacophore mapping.
  30. Machine learning models (in progress)
      • Tools used:
        a) PowerMV descriptors: 2D pharmacological fingerprints, Weighted Burden Number and 8 properties
        b) MACCS (166 keys)
        c) rcdk extended graph-based fingerprints
        d) jCompoundMapper library: PHAP2PT3D, PHAP3PT3D, CATS3D, CATS2D
      • So far, none of the descriptors reproduces the 3D screening results well. The ML model is nevertheless promising, because it classifies actives and decoys well with a polykernel SVM.
  31. PCA analysis of predicted compounds
      • 12 different physicochemical properties were calculated using CDK (http://rguha.net/code/java/cdkdesc.html): molecular refractivity, atom polarizabilities, bond polarizabilities, hydrogen bond donors and acceptors, Petitjean number, topological polar surface area, number of rotatable bonds, lipophilicity (XLogP), molecular weight, topological shape and geometrical shape.
  32. Hits retrieved after visual inspection and pharmacophore mapping
  33. Docking of predicted compounds
  34. Tools used
      • Docking and pharmacophore: Schrödinger's Glide and Phase
      • Shape-based screening: vROCS
      • Performance calculation and visualization: R statistics, ggplot2, enrichvs package
  35. More work
      • Working on the design of PknG inhibitors
      • Enhanced ranking systems for better prediction
      • An automated protocol for developing enhanced virtual screening using open source tools
  36. Acknowledgements
      • Indo-US Science and Technology Forum
      • Prof P. Yogeshwari and Prof D. Sriram (BITS Hyderabad)
      • Computer Aided Drug Design Lab, BITS Pilani Hyderabad
      • Prof David J Wild
      • OSDD Team