4. Prediction of functional associations
   - “Protein mode”: separate network for each species
   - “COG mode”: one network covering all species
5. STRING provides a protein network based on integration of diverse types of evidence
   - Genomic neighborhood
   - Species co-occurrence
   - Gene fusions
   - Database imports
   - Experimental interaction data
   - Co-expression
   - Literature co-mentioning
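The slide does not say how the evidence types are integrated. As a minimal sketch, assuming each evidence channel yields a probability-like score and that the channels are independent, the scores can be combined as one minus the product of the complements. The function name `combine_scores` and the example values are hypothetical.

```python
from functools import reduce

def combine_scores(scores):
    """Combine per-channel confidence scores under an independence assumption.

    Each score is read as the probability that the association is real
    according to one evidence channel; the result is the probability that
    at least one channel is right: 1 - prod(1 - s_i).
    """
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
    return 1.0 - reduce(lambda acc, s: acc * (1.0 - s), scores, 1.0)

# Hypothetical scores for one protein pair from two evidence channels.
combined = combine_scores([0.5, 0.5])
```

Under this assumption, two weak but independent lines of evidence give a higher combined score than either alone, and an empty list gives 0.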
7. Integrating physical interaction screens
   - Make binary representation of complexes
   - Yeast two-hybrid data sets are inherently binary
   - Calculate score from number of (co-)occurrences
   - Calculate score from non-shared partners
   - Calibrate against KEGG maps
   - Infer associations in other species
   - Combine evidence from experiments
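The first steps for complex purification data can be sketched as follows. This is an illustration under stated assumptions, not the scoring scheme used by STRING: each complex is expanded into all pairs of its members, and the score is a simple log ratio of observed to expected co-occurrences with a pseudocount. The function names and the toy complexes are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def complexes_to_pairs(complexes):
    """Expand each complex into all unordered member pairs (binary representation).

    Returns (pair_counts, protein_counts): how often each pair was co-purified
    and how often each protein was observed at all.
    """
    pair_counts = Counter()
    protein_counts = Counter()
    for members in complexes:
        unique = sorted(set(members))
        protein_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return pair_counts, protein_counts

def cooccurrence_score(pair, pair_counts, protein_counts, n_complexes):
    """Assumed score: log of observed vs. expected co-occurrences (pseudocount 1)."""
    a, b = pair
    observed = pair_counts[pair]
    expected = protein_counts[a] * protein_counts[b] / n_complexes
    return math.log((observed + 1) / (expected + 1))

# Toy purification results; protein names are placeholders.
complexes = [["A", "B", "C"], ["A", "B"], ["C", "D"]]
pair_counts, protein_counts = complexes_to_pairs(complexes)
score_ab = cooccurrence_score(("A", "B"), pair_counts, protein_counts, len(complexes))
```

Pair keys are sorted tuples, so `("A", "B")` and `("B", "A")` are not both stored. Raw scores like these would still need calibration against a reference such as KEGG maps before they can be compared with other evidence types.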
8. Gene fusion: predicting physical interactions
   - Detect multiple proteins matching to one protein
   - Exclude overlapping alignments
   - Infer associations in other species
   - Calibrate against KEGG maps
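The detection and overlap filter can be sketched as below, assuming alignment hits are already available as tuples of query protein, target protein, and 1-based inclusive coordinates on the target. The tuple layout, the `max_overlap` parameter, and the protein names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def fusion_candidates(hits, max_overlap=0):
    """Find pairs of proteins that align to distinct regions of one target protein.

    hits: iterable of (query, target, target_start, target_end).
    Pairs whose alignments overlap on the target by more than max_overlap
    residues are excluded, since they likely match the same domain.
    """
    by_target = defaultdict(list)
    for query, target, start, end in hits:
        by_target[target].append((query, start, end))

    candidates = set()
    for target, regions in by_target.items():
        for (q1, s1, e1), (q2, s2, e2) in combinations(regions, 2):
            if q1 == q2:
                continue
            overlap = min(e1, e2) - max(s1, s2) + 1  # <= 0 means no overlap
            if overlap <= max_overlap:
                candidates.add((tuple(sorted((q1, q2))), target))
    return candidates

# Placeholder hits: two proteins cover separate halves of "fused",
# a third overlaps both and is therefore rejected.
hits = [
    ("protA", "fused", 1, 250),
    ("protB", "fused", 260, 450),
    ("protC", "fused", 200, 300),
]
candidates = fusion_candidates(hits)
```

Each candidate records both the protein pair and the fused protein that supports it, which is what the later transfer to other species would need.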
9. Mining microarray expression databases
   - Re-normalize arrays by modern method to remove biases
   - Build expression matrix
   - Combine similar arrays by PCA
   - Construct predictor by Gaussian kernel density estimation
   - Calibrate against KEGG maps
   - Infer associations in other species
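The kernel density step can be illustrated with a small sketch. It assumes the input feature is a correlation coefficient between two expression profiles, that positive examples are gene pairs on the same KEGG map and negatives are pairs on different maps, and that the two class densities are combined by Bayes' rule. The bandwidth, the prior, and all names are hypothetical choices.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a density function estimated with Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

def association_probability(r, positives, negatives, bandwidth=0.1, prior=0.5):
    """Estimate P(associated | correlation r) from two class-conditional KDEs."""
    p_pos = gaussian_kde(positives, bandwidth)(r)
    p_neg = gaussian_kde(negatives, bandwidth)(r)
    numerator = prior * p_pos
    denominator = numerator + (1.0 - prior) * p_neg
    # Fall back to the prior where neither density has any support.
    return numerator / denominator if denominator > 0 else prior

# Toy training correlations (made up for illustration only).
positives = [0.7, 0.8, 0.9]
negatives = [-0.1, 0.0, 0.1]
p_high = association_probability(0.8, positives, negatives)
p_low = association_probability(0.0, positives, negatives)
```

With real data the prior would reflect that associated pairs are rare, so 0.5 here is only a placeholder.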
10. Gene neighborhood: predicting co-expression
   - Identify runs of adjacent genes with the same direction
   - Score each gene pair based on intergenic distances
   - Calibrate against KEGG maps
   - Infer associations in other species
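The run detection and distance feature can be sketched as follows. This assumes genes are given as tuples of name, start, end, and strand on a single replicon, and uses the summed intergenic distance between two genes in the same run as the raw feature. The `max_gap` cutoff of 300 bp and the gene names are hypothetical.

```python
def same_strand_runs(genes):
    """Split position-sorted genes into runs of adjacent genes on the same strand."""
    runs, current = [], []
    for gene in genes:
        if current and gene[3] != current[-1][3]:
            runs.append(current)
            current = []
        current.append(gene)
    if current:
        runs.append(current)
    return runs

def pair_distances(genes, max_gap=300):
    """Map each gene pair within a run to the summed intergenic distance between them.

    A gap larger than max_gap ends the chain, so genes on either side of it
    are not paired. Overlapping genes count as a gap of 0.
    """
    distances = {}
    ordered = sorted(genes, key=lambda g: g[1])
    for run in same_strand_runs(ordered):
        for i in range(len(run)):
            total = 0
            for j in range(i + 1, len(run)):
                gap = max(0, run[j][1] - run[j - 1][2] - 1)
                if gap > max_gap:
                    break
                total += gap
                distances[(run[i][0], run[j][0])] = total
    return distances

# Placeholder genes: (name, start, end, strand).
genes = [
    ("a", 1, 100, "+"),
    ("b", 151, 300, "+"),
    ("c", 320, 500, "+"),
    ("d", 600, 700, "-"),
    ("e", 2000, 2100, "-"),
]
distances = pair_distances(genes)
```

Smaller distances would map to higher confidence after calibration; the mapping itself is not shown here.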
11. Co-mentioning in the scientific literature
   - Associate abstracts with species
   - Identify gene names in title/abstract
   - Count (co-)occurrences of genes
   - Test significance of associations
   - Calibrate against KEGG maps
   - Infer associations in other species
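The counting and significance steps can be sketched as below. The slide does not name the statistical test, so a one-sided hypergeometric test is assumed here; gene name recognition is taken as already done, with each abstract reduced to a list of gene identifiers. All names and the toy abstracts are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import comb

def count_mentions(abstract_genes):
    """Count abstracts mentioning each gene and each unordered gene pair."""
    single, pair = Counter(), Counter()
    for genes in abstract_genes:
        unique = sorted(set(genes))
        single.update(unique)
        pair.update(combinations(unique, 2))
    return single, pair

def comention_p_value(n_ab, n_a, n_b, n_total):
    """P(at least n_ab co-mentions by chance), hypergeometric upper tail.

    n_a, n_b: abstracts mentioning gene A and gene B; n_total: all abstracts.
    """
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_ab, upper + 1)
    )
    return tail / comb(n_total, n_b)

# Toy corpus: genes A and B always appear together, C appears alone.
abstracts = [["A", "B"]] * 4 + [["C"]] * 6
single, pair = count_mentions(abstracts)
p_value = comention_p_value(pair[("A", "B")], single["A"], single["B"], len(abstracts))
```

Integer binomial coefficients keep the computation exact until the final division, which is convenient for small corpora but slow for very large counts.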
12. Phylogenetic profile: co-mentioning in genomes
   - Align all proteins against all
   - Calculate best-hit profile
   - Join similar species by PCA
   - Calculate PC profile distances
   - Calibrate against KEGG maps
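A simplified sketch of the profile comparison is given below. It deliberately omits the PCA step that joins similar species and computes a plain Euclidean distance on the raw best-hit profiles instead. It assumes best-hit scores have already been normalized to the range 0 to 1, with 0 meaning no hit in that species. The species labels and scores are hypothetical.

```python
import math

def best_hit_profile(best_hits, species):
    """Build a profile vector: best-hit score per species, 0.0 where absent."""
    return [best_hits.get(s, 0.0) for s in species]

def profile_distance(p, q):
    """Euclidean distance between two profiles over the same species order."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same species")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

species = ["sp1", "sp2", "sp3"]
# Placeholder best-hit scores for three proteins.
profile_x = best_hit_profile({"sp1": 1.0, "sp3": 0.5}, species)
profile_y = best_hit_profile({"sp1": 1.0, "sp3": 0.5}, species)
profile_z = best_hit_profile({"sp2": 1.0}, species)

d_xy = profile_distance(profile_x, profile_y)  # identical profiles
d_xz = profile_distance(profile_x, profile_z)  # complementary profiles
```

Proteins with small profile distances occur in the same set of genomes, which is the signal this evidence type relies on. Projecting the profiles onto principal components first would reduce the weight of closely related species.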