Introduction to STRING


Published on

SEMM, IFOM, Milan, Italy, June 15-16, 2006

Published in: Technology


  1. 1. Introduction to STRING Lars Juhl Jensen EMBL Heidelberg
  2. 2. STRING
  3. 3. integrate diverse evidence
  4. 4. functional interactions
  5. 6. hundreds of proteomes
  6. 7. Ensembl
  7. 8. SWISS-PROT
  8. 9. prokaryotes
  9. 10. genomic context methods
  10. 11. gene fusion
  11. 13. gene neighborhood
  12. 15. phylogenetic profiles
  13. 20. [Figure labels: Cell, Cellulosomes, Cellulose]
  14. 21. eukaryotes
  15. 22. data integration
  16. 24. curated knowledge
  17. 25. MIPS Munich Information center for Protein Sequences
  18. 26. Reactome
  19. 27. KEGG Kyoto Encyclopedia of Genes and Genomes
  20. 28. STKE Signal Transduction Knowledge Environment
  21. 29. literature mining
  22. 30. co-mentioning
  23. 31. NLP Natural Language Processing
  24. 32. MEDLINE
  25. 33. SGD Saccharomyces Genome Database
  26. 34. The Interactive Fly
  27. 35. OMIM Online Mendelian Inheritance in Man
  28. 36. primary experimental data
  29. 37. microarray expression data
  30. 38. GEO Gene Expression Omnibus
  31. 39. SMD Stanford Microarray Database
  32. 40. physical protein interactions
  33. 41. BIND Biomolecular Interaction Network Database
  34. 42. MINT Molecular Interactions Database
  35. 43. GRID General Repository for Interaction Datasets
  36. 44. DIP Database of Interacting Proteins
  37. 45. HPRD Human Protein Reference Database
  38. 46. problems
  39. 47. many sources
  40. 48. different gene identifiers
  41. 49. many types of evidence
  42. 50. questionable quality
  43. 51. not directly comparable
  44. 52. spread over many species
  45. 53. parsers
  46. 54. synonyms lists
  47. 55. quality scores
  48. 56. benchmarking
  49. 57. orthology
  50. 58. how is it actually done?
  51. 59. gene fusion
  52. 60. Calculate all-against-all pairwise alignments; find genes in A that match the same gene in B; exclude overlapping alignments; calibrate against KEGG maps
  53. 61. gene neighborhood
  54. 62. Identify runs of adjacent genes with the same direction; score each gene pair based on intergenic distances; calibrate against KEGG maps; infer associations in other species
  55. 63. phylogenetic profiles
  56. 64. Align all proteins against all; calculate best-hit profile; join similar species by PCA; calculate PC profile distances; calibrate against KEGG maps
  57. 65. literature co-occurrence
  58. 66. Associate abstracts with species; identify gene names in title/abstract; count (co-)occurrences of genes; test significance of associations; calibrate against KEGG maps; infer associations in other species
  59. 67. physical interaction data
  60. 68. Make binary representation of complexes; yeast two-hybrid data sets are inherently binary; calculate score from number of (co-)occurrences; calculate score from non-shared partners; calibrate against KEGG maps; infer associations in other species; combine evidence from experiments
  61. 69. calibrate against KEGG
  62. 71. transfer by orthology
  63. 73. orthologous groups
  64. 75. fuzzy orthology
  65. 76. [Figure labels: "?", Source species, Target species]
  66. 77. combine all evidence
  67. 79. Acknowledgments: The STRING team (EMBL): Christian von Mering, Berend Snel, Martijn Huynen, Sean Hooper, Samuel Chaffron, Julien Lagarde, Mathilde Foglierini, Peer Bork. Literature mining project (EML Research): Jasmin Saric, Rossitza Ouzounova, Isabel Rojas.
  68. 80. Thank you!
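
The counting step of the literature co-occurrence slide (identify gene names in title/abstract, count (co-)occurrences of genes) can be illustrated with a small sketch. This is not the pipeline behind the slides: the function name `count_cooccurrences`, the placeholder gene names, the toy abstracts, and the naive whole-word matching are all assumptions made for illustration. A real system would rely on curated synonym lists, as the "synonyms lists" slide suggests.

```python
import re
from collections import Counter
from itertools import combinations


def count_cooccurrences(abstracts, gene_names):
    """Count mentions per gene and per gene pair, one count per abstract."""
    # Assumption: a gene is "mentioned" if its name occurs as a whole word,
    # ignoring case. Real gene-name recognition is considerably harder.
    patterns = {
        name: re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        for name in gene_names
    }
    single = Counter()
    pairs = Counter()
    for text in abstracts:
        # Sorting gives every pair a canonical (alphabetical) order.
        found = sorted(n for n, p in patterns.items() if p.search(text))
        single.update(found)
        pairs.update(combinations(found, 2))
    return single, pairs


# Toy, made-up abstracts with placeholder gene names.
abstracts = [
    "geneA binds geneB during the early phase.",
    "Expression of geneB peaks late.",
    "geneA and geneB form a complex.",
]
single, pairs = count_cooccurrences(abstracts, ["geneA", "geneB", "geneC"])
print(single["geneB"], pairs[("geneA", "geneB")])
```

The raw counts would then feed the later steps named on the slide (testing significance and calibrating against KEGG maps), which this sketch does not attempt.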

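For the "combine all evidence" slide, one common way to merge per-source confidence scores is to treat them as independent probabilities and compute 1 − Π(1 − s_i). The slides do not state the formula, so the sketch below rests on that independence assumption; the function name `combine_scores` and the example scores are hypothetical.

```python
from math import prod


def combine_scores(scores):
    """Combine confidence scores in [0, 1], assuming independent evidence."""
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range: {s}")
    # The association is missed only if every evidence source misses it.
    return 1.0 - prod(1.0 - s for s in scores)


# Two hypothetical evidence sources, each with confidence 0.5.
combined = combine_scores([0.5, 0.5])
print(combined)  # 0.75
```

Under this scheme several weak but independent lines of evidence can add up to a high combined confidence, which is the motivation for putting all sources on a common, benchmarked scale first.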