Introduction to STRING
SEMM, IFOM, Milan, Italy, June 15-16, 2006

    • Introduction to STRING Lars Juhl Jensen EMBL Heidelberg
    • STRING
    • integrate diverse evidence
    • functional interactions
    • hundreds of proteomes
    • Ensembl
    • prokaryotes
    • genomic context methods
    • gene fusion
    • gene neighborhood
    • phylogenetic profiles
    • (figure labels: Cell, Cellulosomes, Cellulose)
    • eukaryotes
    • data integration
    • curated knowledge
    • MIPS Munich Information center for Protein Sequences
    • Reactome
    • KEGG Kyoto Encyclopedia of Genes and Genomes
    • STKE Signal Transduction Knowledge Environment
    • literature mining
    • co-mentioning
    • NLP Natural Language Processing
    • MEDLINE
    • SGD Saccharomyces Genome Database
    • The Interactive Fly
    • OMIM Online Mendelian Inheritance in Man
    • primary experimental data
    • microarray expression data
    • GEO Gene Expression Omnibus
    • SMD Stanford Microarray Database
    • physical protein interactions
    • BIND Biomolecular Interaction Network Database
    • MINT Molecular INTeraction database
    • GRID General Repository for Interaction Datasets
    • DIP Database of Interacting Proteins
    • HPRD Human Protein Reference Database
    • problems
    • many sources
    • different gene identifiers
    • many types of evidence
    • questionable quality
    • not directly comparable
    • spread over many species
    • parsers
    • synonyms lists
    • quality scores
    • benchmarking
    • orthology
    • how is it actually done?
    • gene fusion
      • Calculate all-against-all pairwise alignments
      • Find genes in A that match the same gene in B
      • Exclude overlapping alignments
      • Calibrate against KEGG maps
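
The fusion steps above can be sketched roughly as follows. This is a minimal illustration, not the STRING implementation: it assumes the all-against-all alignments are already available as tuples, and the function name `fusion_candidates` and the tuple layout are hypothetical. The KEGG calibration step is left out.

```python
from collections import defaultdict
from itertools import combinations


def fusion_candidates(hits):
    """Find pairs of species-A genes that match the same species-B gene.

    hits: iterable of (gene_a, gene_b, start_b, end_b), where start_b..end_b
    is the aligned region on the species-B gene (inclusive, start <= end).
    This input layout is an assumption made for the sketch.
    """
    # Group the alignments by the species-B gene they hit.
    by_target = defaultdict(list)
    for gene_a, gene_b, start, end in hits:
        by_target[gene_b].append((gene_a, start, end))

    pairs = set()
    for gene_b, regions in by_target.items():
        for (a1, s1, e1), (a2, s2, e2) in combinations(regions, 2):
            if a1 == a2:
                continue
            # Exclude overlapping alignments: keep the pair only if the
            # two aligned regions on the B gene do not overlap.
            if e1 < s2 or e2 < s1:
                pairs.add((tuple(sorted((a1, a2))), gene_b))
    return sorted(pairs)
```

Each returned item pairs two species-A genes with the species-B gene that covers both of them, which is the raw evidence that would then be calibrated.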
    • gene neighborhood
      • Identify runs of adjacent genes with the same direction
      • Score each gene pair based on intergenic distances
      • Calibrate against KEGG maps
      • Infer associations in other species
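
A small sketch of the first two neighborhood steps, under stated assumptions: genes are given as `(name, start, end, strand)` tuples for one replicon, a run is broken when the strand changes or the intergenic gap exceeds `max_gap` (the 300 bp default is an arbitrary illustrative value, not taken from the slides), and the raw pair score is simply the summed intergenic distance between the two genes.

```python
def neighborhood_runs(genes, max_gap=300):
    """Group genes into runs of adjacent genes with the same direction.

    genes: list of (name, start, end, strand) on a single replicon.
    max_gap: assumed cutoff (bp) on the intergenic distance inside a run.
    """
    runs, current = [], []
    for gene in sorted(genes, key=lambda g: g[1]):
        if current:
            prev = current[-1]
            gap = gene[1] - prev[2]
            # Start a new run on a strand switch or a large gap.
            if gene[3] != prev[3] or gap > max_gap:
                runs.append(current)
                current = []
        current.append(gene)
    if current:
        runs.append(current)
    return runs


def pair_distances(run):
    """Raw score per gene pair in a run: total intergenic distance between them."""
    distances = {}
    for i in range(len(run)):
        total = 0
        for j in range(i + 1, len(run)):
            # Add the gap between gene j-1 and gene j (overlaps count as 0).
            total += max(0, run[j][1] - run[j - 1][2])
            distances[(run[i][0], run[j][0])] = total
    return distances
```

Smaller distances indicate tighter neighbors; turning these raw values into confidence scores is the job of the calibration step.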
    • phylogenetic profiles
      • Align all proteins against all
      • Calculate best-hit profile
      • Join similar species by PCA
      • Calculate PC profile distances
      • Calibrate against KEGG maps
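
To make the profile idea concrete, here is a simplified sketch that builds best-hit profiles and compares them with a plain Euclidean distance. It deliberately skips the PCA step named above and works on the raw profiles instead, so it illustrates the principle rather than the actual procedure. The function names and the `(query_gene, target_species, bit_score)` input layout are assumptions.

```python
import math


def best_hit_profiles(hits, species):
    """Build one best-hit profile per gene.

    hits: iterable of (query_gene, target_species, bit_score).
    species: ordered list of species names defining the profile columns.
    Returns {gene: [best score in each species, 0.0 where there is no hit]}.
    """
    column = {name: i for i, name in enumerate(species)}
    profiles = {}
    for gene, target, score in hits:
        row = profiles.setdefault(gene, [0.0] * len(species))
        row[column[target]] = max(row[column[target]], score)
    return profiles


def profile_distance(p, q):
    """Euclidean distance between two profiles (small = similar occurrence pattern)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
```

Genes whose profiles lie close together tend to be present and absent in the same species, which is the signal this method exploits.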
    • literature co-occurrence
      • Associate abstracts with species
      • Identify gene names in title/abstract
      • Count (co-)occurrences of genes
      • Test significance of associations
      • Calibrate against KEGG maps
      • Infer associations in other species
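
The counting and significance steps could look roughly like this. The sketch assumes gene names have already been identified, so each abstract is just a collection of gene names. The slides do not say which significance test is used; the one-sided hypergeometric tail below is an illustrative choice, and both function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import comb


def cooccurrence_counts(abstracts):
    """Count in how many abstracts each gene and each gene pair is mentioned."""
    single, pair = Counter(), Counter()
    for genes in abstracts:
        genes = set(genes)
        single.update(genes)
        pair.update(combinations(sorted(genes), 2))
    return single, pair


def hypergeom_tail(n_total, n_a, n_b, n_ab):
    """P(at least n_ab shared abstracts) if the two genes were mentioned independently.

    n_total: number of abstracts; n_a, n_b: abstracts mentioning each gene;
    n_ab: abstracts mentioning both.
    """
    upper = min(n_a, n_b)
    favourable = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_ab, upper + 1)
    )
    return favourable / comb(n_total, n_b)
```

A small tail probability means the two genes are mentioned together more often than their individual frequencies would suggest.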
    • physical interaction data
      • Make binary representation of complexes
      • Yeast two-hybrid data sets are inherently binary
      • Calculate score from number of (co-)occurrences
      • Calculate score from non-shared partners
      • Calibrate against KEGG maps
      • Infer associations in other species
      • Combine evidence from experiments
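
As an illustration of turning complexes into binary interactions, the sketch below expands every purified complex into all pairs of its members and counts how often proteins are seen alone and together. Both the all-pairs expansion and the Jaccard-style raw score are assumptions chosen for simplicity; the slides only state that a score is derived from the number of (co-)occurrences.

```python
from collections import Counter
from itertools import combinations


def complexes_to_pairs(complexes):
    """Expand each complex into all member pairs and count occurrences.

    complexes: iterable of protein-name collections, one per purification.
    Returns (seen, together): per-protein and per-pair purification counts.
    """
    seen, together = Counter(), Counter()
    for members in complexes:
        members = sorted(set(members))
        seen.update(members)
        together.update(combinations(members, 2))
    return seen, together


def raw_pair_score(seen, together, a, b):
    """Illustrative raw score: shared purifications / purifications with either protein."""
    a, b = sorted((a, b))
    n_ab = together[(a, b)]
    denominator = seen[a] + seen[b] - n_ab
    return n_ab / denominator if denominator else 0.0
```

Yeast two-hybrid data would skip the expansion step, since each reported interaction is already a pair.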
    • calibrate against KEGG
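
One way to picture the calibration: sort gene pairs by raw score, cut them into bins, and measure in each bin what fraction of pairs share a KEGG map. The sketch below does exactly that on toy data. The function name `calibrate`, the equal-count binning, and the input layout are assumptions; fitting a smooth curve through the resulting points is not shown.

```python
def calibrate(scored_pairs, kegg_maps, bin_size=2):
    """Estimate, per raw-score bin, the fraction of pairs on a shared KEGG map.

    scored_pairs: list of (gene_a, gene_b, raw_score).
    kegg_maps: {gene: set of KEGG map identifiers}.
    bin_size: number of pairs per bin (illustrative default).
    Returns [(mean raw score, fraction sharing a map), ...], best bin first.
    """
    # Only pairs where both genes are annotated can serve as benchmark.
    benchmark = [
        (score, bool(kegg_maps[a] & kegg_maps[b]))
        for a, b, score in scored_pairs
        if a in kegg_maps and b in kegg_maps
    ]
    benchmark.sort(key=lambda item: item[0], reverse=True)

    curve = []
    for i in range(0, len(benchmark), bin_size):
        chunk = benchmark[i:i + bin_size]
        mean_score = sum(score for score, _ in chunk) / len(chunk)
        fraction = sum(same for _, same in chunk) / len(chunk)
        curve.append((mean_score, fraction))
    return curve
```

Because every evidence type is mapped onto the same benchmark in this way, the resulting scores become directly comparable across sources.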
    • transfer by orthology
    • orthologous groups
    • fuzzy orthology
    • (figure labels: Source species, Target species, ?)
    • combine all evidence
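
Once every evidence type has been calibrated to a probability-like score, the scores for one protein pair can be merged. The slides do not spell out the formula; the sketch below assumes the evidence types are independent, so the combined score is the probability that at least one of them is right. The function name `combine_scores` is hypothetical.

```python
from math import prod


def combine_scores(scores):
    """Combine calibrated scores in [0, 1], assuming independent evidence.

    The pair is unsupported only if every evidence type is wrong, so the
    combined score is 1 minus the product of the individual error rates.
    """
    return 1.0 - prod(1.0 - s for s in scores)
```

For example, two independent lines of evidence with a score of 0.5 each combine to 0.75, so several weak signals can add up to a confident association.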
    • Acknowledgments
      • The STRING team (EMBL)
        • Christian von Mering
        • Berend Snel
        • Martijn Huynen
        • Sean Hooper
        • Samuel Chaffron
        • Julien Lagarde
        • Mathilde Foglierini
        • Peer Bork
      • Literature mining project (EML Research)
        • Jasmin Saric
        • Rossitza Ouzounova
        • Isabel Rojas
    • Thank you!