This document discusses biological databases. It defines a biological database as a collection of structured, searchable, and periodically updated biological data like protein sequences, molecular structures, and DNA sequences. It notes that biological data is heterogeneous, high-volume, uncertain, dynamically changing, and integrated from various global sources. The key functions of biological databases are to make biological data available worldwide in a computer-readable format. They are broadly classified into sequence, structure, and pathway databases. Some examples of important biological databases discussed are Swiss-Prot, PDB, GenBank, and COGs.
4. Biological Database
• A biological database is a collection of biological information (protein sequences, molecular structures, DNA sequences) that is
• structured/organised,
• searchable/accessible,
• managed and updated periodically, and
• cross-referenced.
• The database administrator updates the data from time to time by editing existing entries and adding new ones.
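The four properties listed above (structured, searchable, periodically updated, cross-referenced) can be illustrated with a small in-memory sketch. This is only a toy model: the class names, field names, and accession numbers are hypothetical and do not correspond to any real database or its API.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One structured record; every field name here is illustrative."""
    accession: str
    kind: str                       # e.g. "protein" or "dna"
    sequence: str
    cross_refs: list = field(default_factory=list)  # accessions of related entries
    version: int = 1                # bumped on every update


class ToyDatabase:
    """Hypothetical in-memory store keyed by accession."""

    def __init__(self):
        self._entries = {}

    def add(self, entry):
        # Adding new data.
        self._entries[entry.accession] = entry

    def update_sequence(self, accession, new_sequence):
        # Editing existing data; the version counter records the change.
        entry = self._entries[accession]
        entry.sequence = new_sequence
        entry.version += 1

    def get(self, accession):
        return self._entries[accession]

    def search(self, kind):
        # Searchable: return accessions of all entries of the given kind.
        return sorted(e.accession for e in self._entries.values() if e.kind == kind)

    def related(self, accession):
        # Cross-referenced: follow links to other entries that exist in the store.
        refs = self._entries[accession].cross_refs
        return [self._entries[r] for r in refs if r in self._entries]


db = ToyDatabase()
db.add(Entry("P0001", "protein", "MKT", cross_refs=["D0001"]))  # made-up accessions
db.add(Entry("D0001", "dna", "ATGAAAACC"))
db.update_sequence("P0001", "MKTA")
```

Real biological databases add persistence, curation, and access control on top of this; the sketch only shows how the four properties relate to each other.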
5. Features of Biological data
• Data heterogeneity – Biological data are diverse. For example:
• A. Sequences – DNA, RNA, and protein sequences.
• B. Graphs – representing metabolic pathways, signaling pathways, gene regulatory networks, genetic maps, and structured taxonomies.
• C. High-dimensional data – including microarray data (gene expression data) for many genes.
• D. Shape – 3D molecular structures, used to study docking behavior and in drug design.
• E. Temporal data – used to study electrophysiology, developmental biology, protein structure dynamics, and cellular structure dynamics.
• F. Patterns – used to study patterns that characterize gene expression in relation to promoters, transcription factors, regulator molecules, etc.
• G. Model data – computational, mathematical, or statistical models used for parameter estimation, testing, etc.
• H. Scalar and vector fields – used to study charge distribution across the cell surface and Ca and protein fluxes.
6. Features continued
• High volume – Besides being heterogeneous, the data are also voluminous.
• Uncertainty – Biological data are uncertain, because observed and assumed data have yet to be confirmed as true.
• Data curation – Because the data are generated quickly, the contents of biological databases are curated automatically to identify missing links and inconsistencies.
• Large-scale data integration – Large volumes of data generated by laboratories around the world are integrated to form a database.
• Data sharing – Generated biological data are verified by scientists to confirm their reproducibility.
• Dynamic and subject to continual change – To avoid contradictions or discrepancies between old and new data, new organizational database schemes need to be developed.
7. Functions of a Biological database
• To make biological data available to scientists, researchers, or anyone else across the world.
• To make the data available in computer-readable form rather than in printed form.
8. Classification scheme of databases
Biological databases can be broadly classified into sequence, structure, and pathway databases. Sequence databases cover both protein and nucleic acid sequences, whereas structure databases apply primarily to proteins.
9. Main types of data
• Nucleotide sequences (DNA and mRNA)
• Protein sequences
• 3D protein structures
• Complete genomes and their maps
10. Other important data types
• Metabolic pathways and molecular interactions
• Mutations and polymorphisms in molecular sequences and structures, as well as original structures and tissue types
• Genetic maps
• Physicochemical data
• Gene expression profiles
• Two-dimensional DNA chip images of mRNA expression
• Two-dimensional gel electrophoresis images of protein extraction data
11. Objective of the database developer
The chief objective of developing a database is to organize data into a set of structured records so that information can be retrieved easily.
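As a rough illustration of why structured records make retrieval easy, the sketch below builds simple lookup indexes over a handful of records. The record IDs, field names, and values are invented for the example; real databases use far richer schemas and dedicated index structures.

```python
from collections import defaultdict

# Hypothetical structured records: every record has the same named fields.
records = [
    {"id": "REC1", "organism": "yeast", "type": "tRNA"},
    {"id": "REC2", "organism": "human", "type": "protein"},
    {"id": "REC3", "organism": "yeast", "type": "protein"},
]


def build_index(records, field_name):
    """Map each value of `field_name` to the IDs of the records that carry it."""
    index = defaultdict(list)
    for record in records:
        index[record[field_name]].append(record["id"])
    return dict(index)


by_organism = build_index(records, "organism")
by_type = build_index(records, "type")


def find(organism, record_type):
    """Retrieve IDs matching both criteria by intersecting two index lookups."""
    matches = set(by_organism.get(organism, [])) & set(by_type.get(record_type, []))
    return sorted(matches)


yeast_proteins = find("yeast", "protein")
```

Because every record shares the same fields, a query becomes a couple of dictionary lookups instead of a scan through free text.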
12. • A database holds not only operational data but also descriptive data.
• The first biological database was the Atlas of Protein Sequence and Structure, compiled by Margaret Belle Dayhoff (an American physical chemist) in 1965.
• In the 1970s, computers became the storage medium for this information.
13. • In the 1990s, once universities were connected to the worldwide network, this became the standard method for communicating biological data.
14. First database: insulin protein (1956)
First nucleic acid sequence: yeast tRNA
First protein structure database: PDB, developed in 1971 at Brookhaven National Laboratory
17. Classification
• Primary
• Secondary, for example SCOP, CATH, PROSITE, and eMOTIF
• Composite
A secondary structure database contains entries from the Protein Data Bank (PDB), but in an organised manner: the structures are classified into groups such as all-alpha proteins and all-beta proteins. It also contains conserved structural motifs.
19. • A composite database combines a variety of primary database resources and provides links to many resources.
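One way to picture a composite database is as a merge of several primary sources in which each merged record keeps links back to where it came from. The sketch below assumes, purely for illustration, that entries with identical sequences are merged into one record; the source names, entry IDs, and sequences are all made up.

```python
# Two hypothetical primary sources mapping entry IDs to sequences.
primary_a = {"ENTRY1": "MKT", "ENTRY2": "MVL"}
primary_b = {"ENTRY2": "MVL", "ENTRY3": "MAG"}


def build_composite(sources):
    """Merge primary sources into one table keyed by sequence.

    Assumption (not from the slides): identical sequences are treated as the
    same record, and each record lists (source name, entry ID) links.
    """
    composite = {}
    for source_name, entries in sources.items():
        for entry_id, sequence in entries.items():
            record = composite.setdefault(
                sequence, {"sequence": sequence, "links": []}
            )
            record["links"].append((source_name, entry_id))
    return composite


composite = build_composite({"source_a": primary_a, "source_b": primary_b})
shared_links = composite["MVL"]["links"]
```

Here the sequence found in both sources appears once in the composite, with a link to each primary entry, so a single search can reach several underlying resources.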