Functional association networks - The STRING and STITCH web resources
Lars Juhl Jensen, EMBL Heidelberg
functional associations
 
data integration
 
STRING
 
STITCH
 
373 genomes
 
model organism databases
Ensembl
Genome Reviews
RefSeq
genomic context methods
gene fusion
 
conserved neighborhood
operons
 
bidirectional promoters
 
phylogenetic profiles
 
primary experimental data
gene coexpression
 
GEO Gene Expression Omnibus
expression compendia
protein interactions
 
genetic interactions
 
BIND Biomolecular Interaction Network Database
BioGRID General Repository for Interaction Datasets
DIP Database of Interacting Proteins
IntAct
MINT Molecular Interactions Database
HPRD Human Protein Reference Database
curated knowledge
complexes
MIPS Munich Information center for Protein Sequences
Gene Ontology
pathways
 
KEGG Kyoto Encyclopedia of Genes and Genomes
Reactome
PID NCI-Nature Pathway Interaction Database
STKE Signal Transduction Knowledge Environment
literature mining
MEDLINE
SGD Saccharomyces Genome Database
The Interactive Fly
OMIM Online Mendelian Inheritance in Man
synonym lists
co-mentioning
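Co-mentioning can be illustrated with a minimal sketch, assuming it means counting how often two genes are named in the same abstract after aliases have been mapped to a canonical name through a synonym list. The function name, the tokenization, the alias table, and the second example sentence are all hypothetical; the slides do not describe the actual text-mining pipeline.

```python
from collections import Counter
from itertools import combinations


def comention_counts(abstracts, synonyms):
    """Count gene pairs that are named together in the same abstract.

    `synonyms` maps a canonical gene name to a list of aliases
    (an illustrative stand-in for the synonym lists on the slide).
    """
    # Reverse lookup: lower-cased alias -> canonical gene name.
    lookup = {
        alias.lower(): gene
        for gene, aliases in synonyms.items()
        for alias in aliases
    }
    counts = Counter()
    for text in abstracts:
        # Naive tokenization; real pipelines need proper entity recognition.
        tokens = text.lower().replace(",", " ").replace(".", " ").split()
        genes = sorted({lookup[t] for t in tokens if t in lookup})
        # Each unordered pair is counted once per abstract.
        counts.update(combinations(genes, 2))
    return counts


# Illustrative alias table and toy abstracts (assumptions, not real data).
synonyms = {"CYC1": ["CYC1"], "CYC7": ["CYC7"], "HAP1": ["HAP1", "CYP1"]}
abstracts = [
    "The expression of CYC1 and CYC7 is controlled by HAP1.",
    "CYP1 regulates CYC1.",
]
counts = comention_counts(abstracts, synonyms)
```

In this toy input the pair CYC1/HAP1 is counted twice because the alias in the second sentence resolves to the same canonical name.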
 
NLP Natural Language Processing
Gene and protein names
Cue words for entity recognition
Verbs for relation extraction
[nxgene The GAL4 gene]
[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
too easy …
…  to be true
many data types
not comparable
different error rates
many sources
different file formats
different gene identifiers
redundancy
spread over many species
raw quality scores
reproducibility
 
intergenic distances
 
benchmarking
calibrate vs. gold standard
 
raw quality scores
probabilistic scores
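The step from raw quality scores to probabilistic scores can be sketched as follows, assuming the calibration bins the raw scores and assigns each bin the fraction of its pairs that appear in the gold standard. The bin edges, the toy data, and the function name are made up for illustration; the slides do not say how the calibration curve is actually fitted.

```python
from bisect import bisect_right


def calibrate(raw_scores, in_gold_standard, edges):
    """Return, per bin, the fraction of pairs confirmed by the gold standard.

    `edges` are ascending bin boundaries; a score s falls into bin i when
    edges[i-1] <= s < edges[i]. Empty bins yield None.
    """
    n_bins = len(edges) + 1
    totals = [0] * n_bins
    hits = [0] * n_bins
    for score, is_true in zip(raw_scores, in_gold_standard):
        b = bisect_right(edges, score)
        totals[b] += 1
        hits[b] += int(is_true)
    return [h / t if t else None for h, t in zip(hits, totals)]


# Toy benchmark: raw scores and whether each pair is in the gold standard.
raw = [0.1, 0.2, 0.6, 0.7, 0.9, 0.95]
truth = [False, False, True, False, True, True]
probabilities = calibrate(raw, truth, edges=[0.5, 0.8])
print(probabilities)  # [0.0, 0.5, 1.0]
```

Because every evidence type is calibrated against the same gold standard, the resulting probabilities are comparable across data types, which addresses the "not comparable" problem raised earlier in the slides.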
transfer by orthology
 
two modes
COG mode
 
 
protein mode
 
 
 
combine all evidence
Bayesian scoring scheme
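The slides do not spell out the scoring scheme. As an illustrative assumption, calibrated scores from independent evidence channels can be combined as one minus the product of the probabilities that each channel is wrong. The function name and the example scores below are hypothetical.

```python
from math import prod


def combine_scores(scores):
    """Combine per-channel probabilities, assuming the channels are independent.

    Returns 1 - product(1 - s_i): the probability that at least one
    channel correctly supports the association.
    """
    scores = list(scores)
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range: {s}")
    return 1.0 - prod(1.0 - s for s in scores)


# Hypothetical calibrated scores for one protein pair.
evidence = {"neighborhood": 0.4, "coexpression": 0.5, "textmining": 0.2}
combined = combine_scores(evidence.values())
print(round(combined, 3))  # 0.76
```

Under this assumption the combined score is never lower than the strongest single channel, so weak but independent lines of evidence reinforce each other.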
Acknowledgments
Christian von Mering
Michael Kuhn
Manuel Stark
Samuel Chaffron
Philippe Julien
Tobias Doerks
Berend Snel
Jasmin Saric
Isabel Rojas
Peer Bork

Exploring Protein Modular Architecture, European Molecular Biology Laboratory, Heidelberg, Germany, February 1, 2008
