DRUG TARGET PREDICTION USINGSEMANTIC LINKED DATA  Bin Chen  Ph.D. candidate  School of Informatics and Computing, Indiana ...
Beyond Data IntegrationChem2Bio2RDF.ORG
What is DrugTarget?
Why Drug Target Prediction?
How to predict drug target?
?                   Drug 1                  Target 1•Substructure•Side effect•Chemical ontology•Gene expression profile   ...
?        Drug 1             Target 1                              •Sequence                    bind      •3D structure    ...
Troglitazon                                                    e               Chemical ontology                          ...
troglitazone    troglitazone                   Semantic Linked                   Network
Topology is important for association           hasSubstructure   hasSubstructure            bind  Cmpd 1                 ...
Semantic is important for    associationCmpd       bind           Protein       bind                     bind   Protein   ...
(Semantic Link Association Prediction)
Data schema
Path finding                               Drug: Troglitazone>300k nodes, >1million edges                                 ...
Statistical Model---edge weight         bind                           PPIhasPathway                             P(i   j) ...
Statistical Model---path raw  scoreCmpd1        e1    Target                       e2    Target (s)         e4            ...
Path patternPath examples:                                               ProteiCmpd             Protei          bind      ...
Statistical Model---path z score• Randomly sample 100,000 drugtarget pairs, yielding 453,087paths, 35 patterns•Plot Path p...
Statistical Model---associationscore                Fit association scores of random pairs to normal distribution
Association Score distribution among different pairs
Comparing SLAP with linkprediction methods in socialscience       AUC=0.92                  ROC curve
Drug polypharmacology profiles                                                        Bisoprolol                          ...
Drugsimilaritynetwork   •Nodes present   drugs   •Two nodes are   linked if they are   similar in terms   of biological   ...
Dissimilar Drugs have sameindication  Insomnia related drugs
Drug repurposing                                            allergic rhinitis                                 ?           ...
Summary   Semantic Link association can be assessed by    topology and semantics of the network    Domain knowledge play...
Team   Prof. Ying Ding   Prof. David Wild
http://chem2bio2rdf.org/slap
Thanks!
Backup slides
SLAP Pipeline           Path filtering
Chem2Bio2RDF data                                                                                    Other data venders   ...
Ranking Target associated    chemicals   Randomly select three targets   Select all target associated    chemicals as po...
Two objects are related if they are related tosame objects                   Coauthorship                    Same Target
Two objects are related if their relatedobjects are related
Similar Drugs have distinctindications  Levodopa:                 Methyldopa :  dopaminergic agent        antiadrenergic  ...
Association Score distribution among different pairs Direct: drug target interacts with each other physically Indirect: in...
drug target prediction using semantic linked data
drug target prediction using semantic linked data
Upcoming SlideShare
Loading in …5
×

drug target prediction using semantic linked data

1,093 views

Published on

beyond data integration, what else can we do? this presentation shows how we use semantic linked data for drug target prediction.

  • Be the first to comment

drug target prediction using semantic linked data

  1. 1. DRUG TARGET PREDICTION USINGSEMANTIC LINKED DATA Bin Chen Ph.D. candidate School of Informatics and Computing, Indiana University binchen@indiana.edu http://cheminfo.informatics.indiana.edu/~binchen CSHALS, Feb 23, 2012
  2. 2. Beyond Data IntegrationChem2Bio2RDF.ORG
  3. 3. What is DrugTarget?
  4. 4. Why Drug Target Prediction?
  5. 5. How to predict drug target?
  6. 6. ? Drug 1 Target 1•Substructure•Side effect•Chemical ontology•Gene expression profile bind Drug 2 From Ligand (drug) perspective
  7. 7. ? Drug 1 Target 1 •Sequence bind •3D structure •Gene Ontology •Ligand Target 2From target perspective
  8. 8. Troglitazon e Chemical ontology bind bind hypoglycemic ACSL4 drug pathway PPARA bind PPAR GOChemical ontology signaling bind Eicosapentaen pathway Chemical ontology oic Acid Rosiglitazo Response to ne Pioglitazon nutrient bind e GO pathway bind bind PPARG
  9. 9. troglitazone troglitazone Semantic Linked Network
  10. 10. Topology is important for association hasSubstructure hasSubstructure bind Cmpd 1 Cmpd 2 Protein 1 hasSubstructure hasSubstructure bind Cmpd 1 Cmpd 2 Protein 1
  11. 11. Semantic is important for associationCmpd bind Protein bind bind Protein Cmpd 2 1 2 1 bind hasGO hasGO Protein GO:000 ProteinCmpd1 2 01 1 bind Protein PPI ProteinCmpd1 2 1 hasSideeffect hyperten hasSide ffect bind ProteinCmpd1 Cmpd 2 sion 1 substruc ProteinCmpd1 hasSubstructure hasSubstructure Cmpd 2 bind ture1 1
  12. 12. (Semantic Link Association Prediction)
  13. 13. Data schema
  14. 14. Path finding Drug: Troglitazone>300k nodes, >1million edges Target :PPARG
  15. 15. Statistical Model---edge weight bind PPIhasPathway P(i j) =1/3 Target Target I PPI J hasGO PPI hasGO
  16. 16. Statistical Model---path raw scoreCmpd1 e1 Target e2 Target (s) e4 2 e3 1 (t)
  17. 17. Path patternPath examples: ProteiCmpd Protei bind bind Cmpd bind n 1 n2 1 1 ProteiCmpd Protei bind bind Cmpd bind n 3 n1 4 5Path pattern: Protei ProteiCmpd bind bind bind n Cmpd n
  18. 18. Statistical Model---path z score• Randomly sample 100,000 drugtarget pairs, yielding 453,087paths, 35 patterns•Plot Path pattern scoredistribution•Convert path score to path zscore Path Pattern Score Distribution
  19. 19. Statistical Model---associationscore Fit association scores of random pairs to normal distribution
  20. 20. Association Score distribution among different pairs
  21. 21. Comparing SLAP with linkprediction methods in socialscience AUC=0.92 ROC curve
  22. 22. Drug polypharmacology profiles Bisoprolol Nadolol Acebutolol Alprenolol Betaxolol Trichlormethiazide Metolazone Chlorthalidone Enalapril Trandolapril Fosinopril Quinapril Benazepril Rescinnamine Moexipril Polypharmacology profile comprises of association scores of one drug against over one thousand targets
  23. 23. Drugsimilaritynetwork •Nodes present drugs •Two nodes are linked if they are similar in terms of biological function. •Nodes are colored by their therapeutic indications
  24. 24. Dissimilar Drugs have sameindication Insomnia related drugs
  25. 25. Drug repurposing allergic rhinitis ? Anti-Parkinsonhttp://www.ebi.ac.uk/chebi/searchId.do?chebiId=3398
  26. 26. Summary Semantic Link association can be assessed by topology and semantics of the network Domain knowledge plays an important role!
  27. 27. Team Prof. Ying Ding Prof. David Wild
  28. 28. http://chem2bio2rdf.org/slap
  29. 29. Thanks!
  30. 30. Backup slides
  31. 31. SLAP Pipeline Path filtering
  32. 32. Chem2Bio2RDF data Other data venders compound protein/gene chemogenomics literature othersChem2Bio2RDF Datasets Chen, B., Dong. X., Jiao, D., Wang, H., Zhu, Q., Ding, Y., Wild, D.J. Chem2Bio2RDF: a semantic framework for linking and data mining chemogenomic and systemshttp://chem2bio2rdf.org chemical biology data. BMC Bioinformatics, 2010, 11:255
  33. 33. Ranking Target associated chemicals Randomly select three targets Select all target associated chemicals as positive link Randomly select equal number of chemicals that are not associated with the target Compare with Naïve bayes using Molecular Weight, ALogP, number of hydrogen bond acceptors and donors, the number of rotatable bonds and FCFP_6 as descriptors Leave one out validation
  34. 34. Two objects are related if they are related tosame objects Coauthorship Same Target
  35. 35. Two objects are related if their relatedobjects are related
  36. 36. Similar Drugs have distinctindications Levodopa: Methyldopa : dopaminergic agent antiadrenergic Anti-parkinson drug Antihypertensive drug Slap similarity: p value>0.05 Tanimoto coefficient=0.89
  37. 37. Association Score distribution among different pairs Direct: drug target interacts with each other physically Indirect: indirect interaction (e.g., change gene expression) Random: random drug target pairs

×