Molecular similarity searching methods, seminar
Here we present a new method of classifying the similar molecules using


Published in: Technology
  • Each bit in the fingerprint represents one molecular fragment
  • Transcript

    • 1. Molecular similarity searching methods in drug discovery. By: Haytham Hijazi. Advisor: Univ.-Prof. Hon.-Prof. Dr. Dieter Roller. A presentation in the Advanced Graphical Engineering Systems seminar, 2011/2012.
    • 2. In this work, I propose a contribution to the field of cheminformatics. Cheminformatics means solving chemical problems using computational methods [1]. [1] James Rhodes, Stephen Boyer, Jeffrey Kreulen, Ying Chen, Patricia Ordonez, “Mining patents using molecular similarity search”, IBM Almaden Services Research, Pacific Symposium on Biocomputing 12:304-315 (2007).
    • 3. Agenda
      • The main question in this research
      • The principle of similarity
      • Drug discovery as an application
      • Research problem
      • Molecular representations (1D, 2D, …)
      • Searching the similarity
      • Similarity coefficient calculations
      • The probabilistic model (BIM)
      • The contribution (MDC)
      • Experiments, conclusions and discussion
    • 4. “Similarity is in the eye of the beholder”: shape, colour, size, pattern.
    • 5. Question: Which molecules in a database are similar to the query molecule? Applications: finding better compounds than the initial lead compound (drug discovery); property prediction for an unknown compound.
    • 6. Structurally similar molecules are assumed to have similar biological properties. Similar biological properties → drug discovery [1]. [1] Sylvaine Roy and Laurence Lafanechère, “Chemogenomics and Chemical Genetics: A User's Introduction for Biologists, Chemists and Informaticians”, Molecular similarity, Springer Berlin, ISBN 978-3-642-19614-0, 1st Edition.
    • 7. Claim: General manufacturing problems!
    • 8. Molecule representation → Feature selection → Similarity coefficient calculations and ranking for search.
    • 9. Historical progression: complete structure, then sub-structure. Descriptors: 1D (physicochemical properties), 2D, 3D, and 4D. Connectivity tables and graph theory! Image source: Karine Audouze, “Representation of molecular structures and structural diversity”, ChemoInformatics in Drug Discovery, 2009.
    • 10. SMILES (Simplified Molecular Input Line Entry System). Examples: CC(=O)OC1=CC=CC=C1C(=O)O and CCCC1=NN(C2=C1NC(=NC2=O)C3=C(C=CC(=C3)S(=O)(=O)N4CCN(CC4)C)OCC)C. Source: Karine Audouze, “Representation of molecular structures and structural diversity”, ChemoInformatics in Drug Discovery, 2009.
    • 11. A fingerprint is a vector encoding the presence (‘1’) or absence (‘0’) of fragment substructures in a molecule. Fingerprints are either dictionary based or hash based. Example dictionary (descriptor: fragment): 1: AR; 2: CCCCN; 3: Me; 9: NH2 [1] [2]. [2] Source: Karine Audouze, “Representation of molecular structures and structural diversity”, ChemoInformatics in Drug Discovery, 2009.
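The two fingerprint styles named on slide 11 can be illustrated with a minimal Python sketch. It assumes the fragments of a molecule have already been enumerated as strings; the fragment set `molecule_fragments`, the 16-bit width, and the use of CRC32 as the hash are illustrative assumptions, not details taken from the slides.

```python
import zlib

# Fragment dictionary taken from the slide's example table.
FRAGMENT_DICTIONARY = ["AR", "CCCCN", "Me", "NH2"]


def dictionary_fingerprint(fragments, dictionary=FRAGMENT_DICTIONARY):
    """One bit per dictionary entry: 1 if the fragment is present, else 0."""
    return [1 if frag in fragments else 0 for frag in dictionary]


def hashed_fingerprint(fragments, n_bits=16):
    """Hash each fragment to a bit position; different fragments may collide."""
    bits = [0] * n_bits
    for frag in fragments:
        position = zlib.crc32(frag.encode("utf-8")) % n_bits
        bits[position] = 1
    return bits


# Hypothetical molecule containing an aromatic ring and an amino group.
molecule_fragments = {"AR", "NH2"}
print(dictionary_fingerprint(molecule_fragments))  # [1, 0, 0, 1]
print(hashed_fingerprint(molecule_fragments))
```

A dictionary-based fingerprint keeps a one-to-one mapping between bits and fragments, whereas a hashed fingerprint needs no predefined dictionary but can map two fragments to the same bit.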
    • 12. In 3D keys, the position of each bit corresponds to a certain range of distances or angles. Computationally complex. Source: Karine Audouze, “Representation of molecular structures and structural diversity”, ChemoInformatics in Drug Discovery, 2009.
    • 13. Molecule representation → Feature selection → Similarity coefficient calculations and ranking for search.
    • 14. Structure search: exact structure search; substructure search; similarity searching (maximal common subgraph isomorphism, Tanimoto/Dice/Cosine coefficients).
    • 15. The similarity measure (coefficient) is a quantitative measure of similarity. It is used to rank the results of the query; results are sorted in decreasing order of similarity. Types: distance coefficients, probabilistic coefficients, correlation coefficients, association coefficients.
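    • 16. Similarity coefficients, with $a$ and $b$ the numbers of bits set in molecules A and B, $c$ the number of bits set in both, $d$ the number of bits set in neither, and $n = a + b - c + d$ the fingerprint length:
      • Association: simple matching $(c+d)/(a+b-c+d)$; Jaccard measure (Tanimoto) $c/(a+b-c)$, i.e. bits in A AND B over bits in A OR B; Cosine (Ochiai) $c/\sqrt{ab}$; Dice $2c/(a+b)$.
      • Distance: Hamming distance $a+b-2c$; Euclidean distance $\sqrt{a+b-2c}$; Soergel distance $(a+b-2c)/(a+b-c)$.
      • Other coefficients: pattern difference $(a-c)(b-c)/n^2$; size difference $(a-b)^2/n^2$.
      • Source: Naomie Salim, “The study of probability model for compound similarity searching”, UTM Research Management Centre Project Vote – 75207, University of Malaysia, 2009.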
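As a hedged illustration of the coefficients on slide 16, the sketch below evaluates them from bit counts. It assumes one consistent notation (a and b are the on-bits in A and B, c the on-bits common to both, d the bits off in both) and non-empty fingerprints; the function name `coefficients` is hypothetical. The sample counts a = 13, b = 8, c = 6 come from the 32-bit example on slide 17, which gives d = 32 - (13 + 8 - 6) = 17.

```python
import math


def coefficients(a, b, c, d):
    """Similarity and distance coefficients from bit counts.

    a, b: on-bits in A and in B; c: on-bits common to both; d: bits off in both.
    Assumes at least one bit is set in each fingerprint.
    """
    n = a + b - c + d  # total fingerprint length
    return {
        "simple_matching": (c + d) / n,
        "tanimoto": c / (a + b - c),
        "cosine": c / math.sqrt(a * b),
        "dice": 2 * c / (a + b),
        "hamming": a + b - 2 * c,
        "euclidean": math.sqrt(a + b - 2 * c),
        "soergel": (a + b - 2 * c) / (a + b - c),
    }


for name, value in coefficients(a=13, b=8, c=6, d=17).items():
    print(f"{name}: {value:.4f}")
```

With this notation the Soergel distance is the complement of the Tanimoto coefficient, which is a quick sanity check on any implementation.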
    • 17. Assume we generate fragment-based fingerprint bits. Molecule A: 00010100010101000101010011110100. Molecule B: 00000000100101001001000011100000. Tanimoto coefficient $= c/(a+b-c)$, where $a$ and $b$ are the numbers of bits set in A and B, and $c$ is the number of bits set in A AND B. Here $a = 13$, $b = 8$, $c = 6$, so Tanimoto $= 6/(13+8-6) = 0.4$.
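The worked example on slide 17 can be checked with a few lines of Python. This is a minimal sketch that treats the fingerprints as strings of '0' and '1'; the helper names are illustrative.

```python
def on_bits(fp):
    """Number of bits set to 1 in a fingerprint string."""
    return fp.count("1")


def common_bits(fp_a, fp_b):
    """Number of positions where both fingerprints have a 1 (A AND B)."""
    return sum(1 for x, y in zip(fp_a, fp_b) if x == "1" and y == "1")


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient c / (a + b - c) for two equal-length fingerprints."""
    a, b, c = on_bits(fp_a), on_bits(fp_b), common_bits(fp_a, fp_b)
    return c / (a + b - c)


mol_a = "00010100010101000101010011110100"
mol_b = "00000000100101001001000011100000"
print(on_bits(mol_a), on_bits(mol_b), common_bits(mol_a, mol_b))  # 13 8 6
print(tanimoto(mol_a, mol_b))  # 0.4
```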
    • 18. Associate the relevance of a structure with an explicit feature.
      • $p_i$ = probability that bit $b_i$ appears in an active structure.
      • $q_i$ = probability that bit $b_i$ appears in an inactive structure.
      • $\alpha_i$ is a binary selector: $\alpha_i = 1$ if the bit occurs in the structure; otherwise $\alpha_i = 0$ and the term is negated.
      • $P(A \mid S)$ is the probability of an active structure given S.
      • $P(NA \mid S)$ is the probability of an inactive structure given S.
      • $P(A)$ is the probability of actives.
      • $P(NA)$ is the probability of inactives.
      • Source: Naomie Salim, “The study of probability model for compound similarity searching”, UTM Research Management Centre Project Vote – 75207, University of Malaysia, 2009.
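The transcript lists the quantities of the probabilistic model but not the scoring formula itself. As a sketch only, the code below uses the standard binary independence form, in which bits are assumed independent given the class, so the log-odds of activity is a sum of per-bit terms. The function name, the example probabilities, and the default prior are assumptions for illustration, not values from the presentation.

```python
import math


def bim_log_odds(bits, p, q, prior_active=0.5):
    """Log of P(A|S) / P(NA|S) under the bit-independence assumption.

    bits: the selectors alpha_i (1 if bit i occurs in the structure, else 0)
    p[i]: probability that bit i appears in an active structure
    q[i]: probability that bit i appears in an inactive structure
    """
    score = math.log(prior_active / (1.0 - prior_active))
    for alpha, p_i, q_i in zip(bits, p, q):
        if alpha:
            score += math.log(p_i / q_i)
        else:
            # Absent bit: use the complementary ("negated") probabilities.
            score += math.log((1.0 - p_i) / (1.0 - q_i))
    return score


def probability_active(log_odds):
    """Convert log-odds back to P(A|S)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical two-bit structure and hypothetical bit probabilities.
score = bim_log_odds(bits=[1, 0], p=[0.8, 0.3], q=[0.2, 0.6])
print(score, probability_active(score))
```

Structures can then be ranked by this score in the same way that they are ranked by a similarity coefficient.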
    • 19. Claim: General manufacturing problems!
    • 20. Diagram components: molecular dynamics simulating tool; active compounds database; physicochemical properties; voting; classification algorithm; Class 1, Class 2, …, Class n.
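The transcript does not say which classification algorithm or voting rule the diagram on slide 20 uses. Purely as an illustration of classifying a compound by voting over property vectors, the sketch below swaps in a k-nearest-neighbour majority vote; the property vectors, class labels, and the function name `classify_by_vote` are all hypothetical.

```python
import math
from collections import Counter


def classify_by_vote(query, labelled, k=3):
    """Assign the majority class among the k nearest labelled compounds.

    query: tuple of numeric properties for the unknown compound
    labelled: list of (property_vector, class_label) pairs
    This is a stand-in k-NN vote, not the method from the presentation.
    """
    nearest = sorted(labelled, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-property vectors for already classified compounds.
known = [
    ((1.0, 0.2), "class 1"),
    ((1.2, 0.1), "class 1"),
    ((0.9, 0.4), "class 1"),
    ((5.0, 3.1), "class 2"),
    ((5.3, 2.8), "class 2"),
]
print(classify_by_vote((1.1, 0.3), known))  # class 1
```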
    • 21. Better insight into similarity in terms of bioactivity, toxicity, reactivity, … (+); search time (+); prediction and voting possibilities (+); cost of simulation tools (-); classification errors (-).
    • 22. Materials Explorer; Itemtracker (freezer/cryogen sample tracking system); CHARMM; MDynaMix.
    • 23. [Chart: fingerprint generation time. Y axis: time (ms), 0 to 30. X axis: max path length, 4 to 8. Series: 2 bits, 3 bits, 4 bits.] Consider if we have more than 1000 bits! Data source: simulating tool indicated in the report [17].
    • 24. [Chart: hit rate against selection size. Y axis: hit rate, 0 to 0.18. X axis: selection size, 0 to 2500.] The more we increase the number of selected features, the more the hit rate of finding actives decreases. Data source: simulating tool indicated in the report [17].
    • 25. Even fragment-based fingerprint generation is time consuming. Probabilistic models and machine learning introduced substantial changes. Mixing more than one type of descriptor seems efficient in terms of both time and quality of results. Experimental results are still needed.
    • 26. Thanks for listening. Haytham Hijazi.
