UniProt & Ontologies
How ontologies are used in the context of a large life sciences database
Eric Jain
Swiss Institute of Bioinformatics, Geneva
April 2007
What is UniProt?
Components: UniProtKB, UniRef, UniParc, UniMES, ...
UniParc: 8.9M sequences
UniRef: 8.9M clusters
- 50%: 1.5M
- 90%: 2.9M
- 100%: 4.5M
UniProtKB: 4.5M entries (0.3M reviewed)
- Species & organelles
- Description of function etc.
- Keywords & GO
- Description of sequence features
- Sequence(s)
- Literature citations
- Cross-references
- Protein & gene names
What is an Ontology?
- unique identifier
- names and synonyms
- relationships within the ontology
- stable!
- mapped to other ontologies
- human-readable definitions
- machine-readable definitions
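The properties listed above can be sketched as a small data structure. This is only an illustration, not UniProt's actual data model: the `Term` class, its field names, and the `EX:`/`OTHER:` identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Term:
    """One ontology term (hypothetical layout, for illustration only)."""
    identifier: str                                   # unique and stable
    name: str
    synonyms: List[str] = field(default_factory=list)
    definition: str = ""                              # human-readable definition
    is_a: List[str] = field(default_factory=list)     # relationships within the ontology
    xrefs: List[str] = field(default_factory=list)    # mappings to other ontologies


# Made-up terms, indexed by identifier.
TERMS: Dict[str, Term] = {
    t.identifier: t
    for t in [
        Term("EX:0001", "cellular component", definition="A part of a cell."),
        Term(
            "EX:0002",
            "nucleus",
            synonyms=["cell nucleus"],
            definition="Organelle that contains the chromosomes.",
            is_a=["EX:0001"],
            xrefs=["OTHER:42"],
        ),
    ]
}


def parent_names(identifier: str) -> List[str]:
    """Follow the 'is_a' links of a term and return the parents' names."""
    return [TERMS[parent].name for parent in TERMS[identifier].is_a]


nucleus_parents = parent_names("EX:0002")
```

Here the `is_a` links play the role of a (very simple) machine-readable definition: software can follow them without parsing the free-text definition.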
Why?
Practical. Navigation, auto-completion etc.
Consistency! More than one way to say one thing...
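As a rough sketch of the consistency point, free-text labels can be resolved to a single identifier through a synonym table. The table contents, the `EX:` identifiers, and the `normalize` helper are assumptions made up for this example.

```python
from typing import List, Optional

# Hypothetical synonym table: every accepted spelling points at one identifier.
SYNONYMS = {
    "nucleus": "EX:0002",
    "cell nucleus": "EX:0002",
    "mitochondrion": "EX:0003",
    "mitochondria": "EX:0003",
}


def normalize(label: str) -> Optional[str]:
    """Return the identifier for a free-text label, or None if it is unknown."""
    key = " ".join(label.lower().split())  # ignore case and extra whitespace
    return SYNONYMS.get(key)


annotations: List[str] = ["Nucleus", "cell  nucleus", "Mitochondria", "golgi"]
resolved = [normalize(a) for a in annotations]
```

Labels that cannot be resolved come back as `None`, which makes them easy to flag for review instead of silently creating yet another way to say the same thing.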
Aggregate, i.e. set-oriented views.
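One way to picture a set-oriented view: an entry annotated with a specific term also belongs to the set of every more general term. The sketch below assumes a hypothetical child-to-parent table and made-up entry accessions (`P1` to `P3`); it is not how UniProt implements this.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Hypothetical "child -> parents" edges of a tiny ontology.
PARENTS: Dict[str, List[str]] = {
    "EX:0002": ["EX:0001"],  # nucleus -> cellular component
    "EX:0003": ["EX:0001"],  # mitochondrion -> cellular component
    "EX:0004": ["EX:0002"],  # nucleolus -> nucleus
}

# Hypothetical entries, each annotated with its most specific terms.
ANNOTATIONS: Dict[str, Set[str]] = {
    "P1": {"EX:0004"},
    "P2": {"EX:0002"},
    "P3": {"EX:0003"},
}


def ancestors(term: str) -> Set[str]:
    """Collect every term reachable by following parent links upwards."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        for parent in PARENTS.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def entries_by_term() -> Dict[str, Set[str]]:
    """Group entries under each annotated term and all of its ancestors."""
    result: Dict[str, Set[str]] = defaultdict(set)
    for entry, terms in ANNOTATIONS.items():
        for term in terms:
            for t in {term} | ancestors(term):
                result[t].add(entry)
    return result


view = entries_by_term()
```

Looking up `view["EX:0002"]` then returns the entries annotated with the nucleus term itself plus those annotated with anything below it.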
Automate?
What?
- Keywords
- Taxonomy
- Enzyme
- Pathways
- Tissues
- Subcellular Locations
- Cellular Components
- Gene Ontology
Summary
- Increased use of ontologies is inevitable as data volumes grow.
- UniProt has (or is in the process of introducing) several ontologies.
- What data will be "ontologized" and how detailed the ontologies are depend on your feedback! [email_address]
beta.uniprot.org login: guest/amazing
