3. It consists of two sections:
Swiss-Prot
• Reviewed
• Manually annotated
• Records with information extracted from the literature and curator-evaluated computational analysis.
TrEMBL (Translated EMBL, where EMBL is the European Molecular Biology Laboratory)
• Unreviewed
• Computationally annotated
• Records that await full manual annotation.
5. Introduction
• Swiss-Prot was created at the Department of Medical Biochemistry of the University
of Geneva and has been maintained in collaboration with the European Molecular
Biology Laboratory (EMBL) since 1987.
• Swiss-Prot strives to provide a high level of annotation, a minimal level of
redundancy and a high level of integration with other databases.
• It is now an equal partnership between the EMBL and the Swiss
Institute of Bioinformatics (SIB).
• TrEMBL is a computer-annotated supplement to Swiss-Prot.
• The entry format is similar to that of the EMBL Nucleotide Sequence
Database, maintained at the European Bioinformatics Institute (EBI).
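To make the shared format concrete, here is a minimal sketch of reading one flat-file entry. It assumes the usual layout in which each line starts with a two-letter line code (ID, AC, DE, OS, SQ and so on) followed by three spaces, sequence lines start with blanks, and the entry ends with "//". The sample entry, the accession P00000 and the function name parse_entry are made up for illustration and do not refer to a real record.

```python
from collections import defaultdict

# Hypothetical, abbreviated entry written for illustration only.
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN           Reviewed;         12 AA.
AC   P00000;
DE   RecName: Full=Example protein;
OS   Homo sapiens (Human).
SQ   SEQUENCE   12 AA;  0 MW;  0000000000000000 CRC64;
     MKTAYIAKQR QI
//
"""


def parse_entry(text):
    """Group the lines of one entry by their two-letter line code."""
    fields = defaultdict(list)
    sequence_parts = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("//"):
            break  # "//" terminates the entry
        code, content = line[:2], line[5:]
        if code == "  ":
            # Sequence lines carry no line code; drop the spacing between blocks.
            sequence_parts.append(content.replace(" ", ""))
        else:
            fields[code].append(content.strip())
    result = dict(fields)
    result["sequence"] = "".join(sequence_parts)
    return result


entry = parse_entry(SAMPLE_ENTRY)
print(entry["AC"], entry["OS"], entry["sequence"])
```

Real entries contain many more line types, so a production parser (for example the one in an established bioinformatics library) is a better choice than this sketch for actual work.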
6. Features of Swiss-Prot
• Annotation
• Minimal Redundancy
• Integration with other databases
• Documentation
7. Annotation Data
Core data
• Sequence data.
• Citation information (bibliographical references).
• Taxonomic data (description of the biological source of the protein).
Annotation
• Post-translational modifications, for example phosphorylation and acetylation.
• Domains and sites, for example calcium-binding regions and zinc fingers.
• Secondary structure, for example alpha helices and beta sheets.
• Quaternary structure, for example homodimers and heterodimers.
• Diseases associated with deficiencies in the protein.
8. Minimal Redundancy
• The data for a given protein often come from more than one literature report.
• These reports are condensed and merged into a single entry so that the data are concise and coherent.
• Conflicts between the reports are listed in each entry.
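The idea of merging several reports and recording their disagreements can be sketched as follows. This is only an illustration: actual curation is done by expert annotators, and the function name merge_reports, the source labels and the rule "the first report is the reference" are assumptions made for the example.

```python
def merge_reports(reports):
    """Merge equal-length sequence reports and list positions that disagree.

    reports: dict mapping a source label to a sequence string.
    Returns (merged_sequence, conflicts), where each conflict is
    (1-based position, residue kept, {source: differing residue}).
    """
    sources = list(reports)
    length = len(reports[sources[0]])
    if any(len(seq) != length for seq in reports.values()):
        raise ValueError("this sketch only handles reports of equal length")

    merged = []
    conflicts = []
    for pos in range(length):
        # Assumption for illustration: the first report is taken as the reference.
        chosen = reports[sources[0]][pos]
        merged.append(chosen)
        differing = {
            src: reports[src][pos]
            for src in sources
            if reports[src][pos] != chosen
        }
        if differing:
            conflicts.append((pos + 1, chosen, differing))
    return "".join(merged), conflicts


merged, conflicts = merge_reports(
    {"ref1": "MKTAY", "ref2": "MKSAY", "ref3": "MKTAY"}
)
print(merged)
print(conflicts)
```

One merged sequence is kept, and the disagreement at position 3 stays visible instead of producing a second, nearly identical entry.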
9. Integration with other databases
• Swiss-Prot provides cross-references to external data collections.
• These cross-references integrate the three types of sequence-related databases
(nucleic acid sequences, protein sequences and protein tertiary
structures).
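In the flat-file format, cross-references appear on DR lines of the form "DR   DATABASE; IDENTIFIER; further fields." The sketch below groups such lines by database. The function name parse_cross_references is hypothetical, and every identifier in the sample lines is a placeholder, not a real record.

```python
def parse_cross_references(lines):
    """Collect DR (database cross-reference) lines, grouped by database name."""
    refs = {}
    for line in lines:
        if not line.startswith("DR   "):
            continue  # skip every other line type
        body = line[5:].strip().rstrip(".")  # drop the line code and final period
        parts = [part.strip() for part in body.split(";")]
        database, details = parts[0], parts[1:]
        refs.setdefault(database, []).append(details)
    return refs


# Placeholder lines for illustration; the identifiers are not real records.
sample_lines = [
    "AC   P00000;",
    "DR   EMBL; X00000; CAA00000.1; -; mRNA.",
    "DR   PDB; 1ABC; X-ray; 2.00 A; A=1-12.",
    "DR   PROSITE; PS00000; EXAMPLE_SITE; 1.",
]

for database, entries in parse_cross_references(sample_lines).items():
    print(database, entries)
```

The three sample lines correspond to the three database types named above: a nucleic acid sequence database, a tertiary structure database and a protein-level resource.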
12. TrEMBL: A computer-annotated supplement to Swiss-Prot
• TrEMBL (Translation of the EMBL nucleotide sequence database) was introduced in 1996.
Why TrEMBL?
• To cope with the increased data flow from genome projects to the sequence databases.
• To maintain the high annotation quality of Swiss-Prot.
• To make sequences available as quickly as possible.
TrEMBL consists of computer-annotated entries derived from the
translation of all coding sequences (CDS) in the nucleotide sequence
databases, except for CDS already included in Swiss-Prot.
It also contains protein sequences extracted from the literature and
protein sequences submitted directly by the user community.
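The translation step behind TrEMBL entries can be illustrated with a minimal sketch that converts a coding sequence into a protein sequence using the standard genetic code. The function name translate_cds is hypothetical, and the sketch assumes a CDS that is already in frame. It ignores alternative genetic codes and the other details a real pipeline has to handle.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")  # accept RNA input as well
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks an unknown codon
        if amino_acid == "*":
            break  # stop codon ends the protein
        protein.append(amino_acid)
    return "".join(protein)


print(translate_cds("ATGGCCAAATGA"))
```

In the example, ATG, GCC and AAA translate to M, A and K, and TGA is a stop codon, so the result is the three-residue peptide MAK.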
13. TrEMBL
SP-TrEMBL (Swiss-Prot TrEMBL)
• Contains sequences that will eventually be incorporated into Swiss-Prot.
REM-TrEMBL (Remaining TrEMBL)
• Contains sequences that will not be incorporated into Swiss-Prot, for example
synthetic sequences, patent application sequences, fragments of fewer than
8 amino acids, and coding sequences where there is strong experimental
evidence that the sequence does not code for a real protein.
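The split between the two subsections can be expressed as a small decision rule. The sketch below only restates the criteria listed above. The class TremblEntry, its boolean fields and the function assign_section are hypothetical names introduced for illustration and do not describe how the database is actually built.

```python
from dataclasses import dataclass

MIN_FRAGMENT_LENGTH = 8  # fragments shorter than this go to REM-TrEMBL


@dataclass
class TremblEntry:
    """Hypothetical record holding just the properties the rule needs."""
    sequence: str
    is_synthetic: bool = False
    from_patent: bool = False
    is_fragment: bool = False
    known_not_real_protein: bool = False


def assign_section(entry):
    """Return "REM-TrEMBL" or "SP-TrEMBL" following the criteria listed above."""
    if entry.is_synthetic or entry.from_patent or entry.known_not_real_protein:
        return "REM-TrEMBL"
    if entry.is_fragment and len(entry.sequence) < MIN_FRAGMENT_LENGTH:
        return "REM-TrEMBL"
    return "SP-TrEMBL"


print(assign_section(TremblEntry("MKTAYIAKQRQI")))
print(assign_section(TremblEntry("MKTAY", is_fragment=True)))
```

Anything that is not excluded by one of the listed criteria falls through to SP-TrEMBL, which matches the description of SP-TrEMBL as the set of entries that will eventually move into Swiss-Prot.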
16. Conclusion
• Swiss-Prot has continuously enhanced its format and content to keep pace with
the growing body of knowledge in proteomics while maintaining a high quality of
annotation.
• Automated annotation procedures are used for Swiss-Prot in a very
conservative manner.
• The extensive integration of Swiss-Prot with specialized databases
enables users to navigate through current knowledge in the life
sciences, providing an insight into the universe of proteins.