Presage database

TEACHER IN-CHARGE
DR. SAGAR P. KANEKAR
- AKSHAY MORE

Outline
-Abstract
-Background
-Database Model
-Annotations
1. Experimental
2. Prediction
-Facilities
-Availability
-Conclusion

Background
Structural Genomics
- The term was first used by Barry Honig, Wayne Hendrickson, and colleagues (1997) in the
context of solving structures across whole genomes.
- It describes the high-throughput generation of new protein structures and their
analysis in the context of emerging genome sequence data (Terry Gaasterland,
July 1998).
Aim - to characterize the structures of the proteins encoded by a genome.
- to provide an experimental structure or a good theoretical model for every protein in
all completed genomes.

Approaches:
1) Experimental:
- provides essential information about a relatively small number of individual
proteins.
2) Computational:
- extends knowledge obtained from experiments and applies it to potentially
large families of related proteins.
- Computational methods are first used to assign known protein structures to genomic proteins.
- The remaining proteins are clustered into families, and representatives from these
families are selected for experimental characterization. The newly solved
structures are compared with other proteins of known structure in classifications
such as SCOP, CATH or FSSP, to yield information about their evolution and
thence about their function.
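
The computational workflow above can be sketched in a few lines. This is only an illustration under stated assumptions: the identifiers, the precomputed family labels (`family_of`) and the rule for picking a representative (first identifier in sorted order) are all hypothetical and are not taken from PRESAGE or from any particular clustering tool.

```python
from collections import defaultdict


def select_targets(proteins, assigned, family_of):
    """Pick one representative per family among proteins without a structure.

    proteins  -- iterable of protein identifiers (hypothetical IDs)
    assigned  -- set of identifiers already assigned a known structure
    family_of -- dict mapping identifier -> family label (assumed precomputed)
    """
    # Step 1: drop proteins that already have a structure assigned.
    remaining = [p for p in proteins if p not in assigned]

    # Step 2: cluster the remaining proteins into families.
    families = defaultdict(list)
    for pid in remaining:
        families[family_of[pid]].append(pid)

    # Step 3: choose a representative per family (placeholder rule).
    return {family: sorted(members)[0] for family, members in families.items()}
```

In practice the representative would be chosen on experimental grounds; the sorted-order rule only keeps the sketch deterministic.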

Why was PRESAGE developed?
- There was no co-ordination in the selection of new structures for the PDB.
Impact - multiple groups inadvertently began studies on the same protein, even
though there are more than enough important families to go around.
- Computational studies have often been performed in isolation, with researchers
unaware of their colleagues' efforts or the details of their work.
- There was a lack of consistent organization and repositories for these data.

PRESAGE DATABASE
Protein Resource Entailing Structural Annotation of Genomic Entities.
Aim - to improve communication among structural genomics researchers.
To achieve this,
• provides a repository of capsule information about progress in the field.
• aids in the distribution of this knowledge to the biology research community.

Database Model
- Core - a database of protein sequences (derived from SWISS-PROT + TrEMBL), to
which annotations are attached.
- Unlike in SWISS-PROT, the authors of the database do not create or edit these
annotations.
- Instead, any active structural genomics researcher may submit information.
- Original contributors retain full credit for their annotations.
- Entries link to information about the contributor, with optional links to
relevant literature references and associated Web sites.
- The database also provides annotated summary data and analyses.

ANNOTATIONS
- The fundamental unit of information in PRESAGE.
- Attached to a single protein sequence entry.
- Records the name of the annotator and the date on which it was entered.
- Carries details specific to its class.
- Permits free-text comments, listings of relevant papers with MEDLINE
references, and links to other Web sites associated with the annotation.
Two classes:
1) Experimental
2) Prediction
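
The fields listed above suggest a simple record type. The following is a minimal sketch, assuming hypothetical field names; it is not the actual PRESAGE schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Annotation:
    """One annotation attached to a single protein sequence entry."""
    sequence_id: str       # identifier of the annotated sequence entry
    annotator: str         # name of the contributor, who retains credit
    entered_on: date       # date on which the annotation was entered
    annotation_class: str  # "experimental" or "prediction"
    comments: str = ""     # free-text comments
    medline_refs: List[str] = field(default_factory=list)  # relevant papers
    links: List[str] = field(default_factory=list)         # associated Web sites

    def __post_init__(self):
        # Only the two classes named on the slide are accepted.
        if self.annotation_class not in ("experimental", "prediction"):
            raise ValueError(f"unknown annotation class: {self.annotation_class}")
```

Class-specific details (experimental stages, prediction levels) would extend this common record.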

1) Experimental Annotations:
- Indicate that a protein has been selected for structure determination and track
the progress towards the solved structure.
- Analogous to the NCBI/HUGO Human Genome Sequencing Index
(http://www.ncbi.nlm.nih.gov/HUGO/), which records sequencing efforts; the aim is to
prevent inadvertent overlapping studies.
Experimental annotators record the stages their experiments have reached and
specific details associated with those stages.
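
To illustrate how recorded stages can prevent overlapping studies, here is a small sketch of a progress registry. The class, its method names and the stage labels used in the example ("selected", "crystallized") are hypothetical; the slides do not list the stages PRESAGE actually tracks.

```python
class ExperimentalRegistry:
    """Tracks which groups work on which protein and how far they have got."""

    def __init__(self):
        self._progress = {}  # sequence_id -> {group name: latest stage}

    def record(self, sequence_id, group, stage):
        # Register a new target or update the stage a group has reached.
        self._progress.setdefault(sequence_id, {})[group] = stage

    def groups_working_on(self, sequence_id):
        # Lets a group check for existing efforts before selecting a target.
        return sorted(self._progress.get(sequence_id, {}))

    def stage(self, sequence_id, group):
        return self._progress[sequence_id][group]
```

A group would call `groups_working_on("P12345")` before starting work; an empty list means no other effort has been registered for that sequence.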
2) Prediction Annotations:
Computational biologists can register predicted protein structures at 3 levels:

A) Level 1: Assignment
Associates a region of the sequence with a known structure, and asserts that the
two proteins will share a common fold.
B) Level 2: Alignment
Augments this information by indicating how the database sequence maps onto the
solved structure.
C) Level 3: Model
Provides predicted three-dimensional coordinates for the protein sequence.
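
The three levels can be pictured as one record that carries progressively more detail. This sketch assumes that each level adds to the previous one and uses hypothetical field names; it does not reflect PRESAGE's actual storage format.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Prediction:
    sequence_id: str         # annotated sequence entry
    region: Tuple[int, int]  # (start, end) residues of the sequence region
    structure_id: str        # known structure sharing the fold (Level 1)
    # Level 2: pairs of (sequence position, structure position).
    alignment: Optional[List[Tuple[int, int]]] = None
    # Level 3: predicted (x, y, z) coordinates.
    coordinates: Optional[List[Tuple[float, float, float]]] = None

    @property
    def level(self) -> int:
        """Highest prediction level for which data is present."""
        if self.coordinates is not None:
            return 3
        if self.alignment is not None:
            return 2
        return 1
```

For example, a record with only `region` and `structure_id` filled in is a Level 1 assignment.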

FACILITIES
• Retrieval of entries -
Several methods, including
- searches by various identifiers (those used by SWISS-PROT, TrEMBL and
GenBank), or
- searches by keywords in the SWISS-PROT description and comments about the proteins.
• Awareness function -
- allows a user to register interest in a protein; the user then receives email
notification when annotations are made to that protein.
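
Both facilities can be sketched together. The class below is a toy in-memory illustration with hypothetical names; it builds notification messages as strings instead of sending email, and it is not how the PRESAGE server is implemented.

```python
from collections import defaultdict


class EntryIndex:
    def __init__(self):
        self._entries = {}                 # identifier -> description text
        self._watchers = defaultdict(set)  # identifier -> registered addresses

    def add_entry(self, identifier, description):
        self._entries[identifier] = description

    def search_keyword(self, keyword):
        # Case-insensitive keyword search over the description text.
        kw = keyword.lower()
        return sorted(i for i, d in self._entries.items() if kw in d.lower())

    def register_interest(self, identifier, email):
        # Awareness function: remember who wants to hear about this protein.
        self._watchers[identifier].add(email)

    def annotate(self, identifier, note):
        # Return the messages that would be emailed to registered users.
        return [
            f"To {email}: new annotation on {identifier}: {note}"
            for email in sorted(self._watchers.get(identifier, ()))
        ]
```

A user who has called `register_interest` for an identifier would appear in the list returned by `annotate` for that identifier; other users are not notified.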

Availability
• The database is publicly available at http://presage.stanford.edu/.
• Contributors and individuals wishing to use the awareness function may register
on-line, through links from that page.

Conclusion
• The database will help to link researchers in the decentralized field of
structural genomics.
• It will help to make their results readily available.
