Primary and secondary database

PRIMARY AND SECONDARY BIOLOGICAL DATABASES
By
KAUSHAL KUMAR SAHU
Assistant Professor (Ad Hoc)
Department of Biotechnology
Govt. Digvijay Autonomous P. G. College
Raj-Nandgaon (C. G.)

CONTENTS
• INTRODUCTION
• WHAT IS DATA AND DATABASE?
• WHAT IS BIOLOGICAL DATABASE?
• TYPES OF BIOLOGICAL DATABASE
– PRIMARY DATABASE
• Nucleic acid sequence database
• Protein sequence database
– SECONDARY DATABASE
– COMPOSITE DATABASE
– TERTIARY DATABASE
• WHY NEED?
• CONCLUSION
• REFERENCES
5/11/2020
2

INTRODUCTION
Bioinformatics: the application of computational techniques to the management and analysis of biological data.
History:
• The first English use of the word "data" is from the 1640s.
• The word "data" was first used to mean "transmittable and storable computer information" in 1946.
• The first database was created in 1956.
• Insulin was the first protein to be sequenced.

DATA
• A series of observations, measurements, or facts; also called information.
DATABASE
• A large, systematized collection of data that can be expanded, updated, and retrieved rapidly for a specific purpose.
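The three operations in this definition (expand, update, retrieve) can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and the DEMO accessions are invented for illustration and do not come from any real biological database.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences (accession TEXT PRIMARY KEY, organism TEXT, residues TEXT)"
)

# Expand: add new records (made-up accessions and sequences).
conn.executemany(
    "INSERT INTO sequences VALUES (?, ?, ?)",
    [("DEMO001", "Homo sapiens", "ATGGCC"), ("DEMO002", "Mus musculus", "ATGTTT")],
)

# Update: revise an existing record.
conn.execute(
    "UPDATE sequences SET residues = ? WHERE accession = ?", ("ATGGCCTAA", "DEMO001")
)

# Retrieve: look a record up by its primary key.
row = conn.execute(
    "SELECT organism, residues FROM sequences WHERE accession = ?", ("DEMO001",)
).fetchone()
count = conn.execute("SELECT COUNT(*) FROM sequences").fetchone()[0]
print(row, count)
```

Keying the table on the accession is one possible design choice; it is what makes retrieval by identifier fast.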

BIOLOGICAL DATABASE
• Storage of biological information (nucleic acid sequences, protein sequences and structures).

DEFINITION
Biological databases are computer sites that organise, store and disseminate files containing literature references, nucleic acid sequences, and protein sequences and structures.

SOURCES ON THE WEB FOR IMPORTANT DATABASES

TYPES OF BIOLOGICAL DATABASE
1. Primary Database
2. Secondary Database
3. Composite Database
4. Tertiary Database

Primary Database
• Stores biomolecular sequences (protein or nucleic acid) and associated annotation information (organism, species, mutations linked to particular diseases, bibliographic references, etc.).
• Primary sources are original materials on which research is based.
• They are neither interpreted, condensed, nor evaluated by other writers.
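As a rough sketch of what one primary-database entry holds, the sequence and its annotation can be modelled as a single record. The class name, field names, and the DEMO0001 entry below are hypothetical and do not follow the actual GenBank, EMBL, or DDBJ record formats.

```python
from dataclasses import dataclass, field


@dataclass
class PrimaryRecord:
    """Hypothetical shape of one primary-database entry (not a real file format)."""

    accession: str                 # identifier assigned by the database
    molecule: str                  # "protein" or "nucleic acid"
    sequence: str                  # the submitted sequence, stored as-is
    organism: str                  # source organism / species annotation
    references: list[str] = field(default_factory=list)         # bibliographic annotation
    disease_mutations: list[str] = field(default_factory=list)  # disease-linked mutations

    def length(self) -> int:
        """Number of residues or bases in the stored sequence."""
        return len(self.sequence)


# Made-up example entry for illustration only.
record = PrimaryRecord(
    accession="DEMO0001",
    molecule="nucleic acid",
    sequence="ATGGCCATTGTA",
    organism="Homo sapiens",
    references=["placeholder citation"],
)
print(record.accession, record.length())
```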

PRIMARY
• Nucleotide sequences
– NCBI GenBank
– EMBL
– DDBJ
• Protein sequences
– PIR
– UniProt
– SWISS-PROT
– TrEMBL

NCBI
• Located in Bethesda, Maryland, and founded in 1988 through legislation sponsored by Senator Claude Pepper.
• It was directed by David Lipman, one of the original authors of BLAST.
• The NCBI houses a series of databases, for example:
– GenBank: DNA sequences.
– PubMed (a bibliographic database): the biomedical literature.
– Other databases, such as the Epigenomics database.

GenBank
• A part of the International Nucleotide Sequence Database Collaboration, which comprises EMBL, DDBJ, and GenBank at NCBI.
• The database was started in 1982 by Walter Goad at Los Alamos National Laboratory.
• As of 15 August 2017, GenBank release 221.0 contained 203,180,606 loci and 240,343,378,258 bases from 203,180,606 reported sequences.
https://www.revolvy.com/main/index.php?s=GenBank

EMBL-EBI
• Established in 1980 at the EMBL laboratories in
Heidelberg, Germany.
• An international, innovative and interdisciplinary
research organisation funded by 23 member states and
two associate member states.
• Location: Hinxton, Cambridge, UK.

DDBJ
• DDBJ release 1 was made available in 1987.
• Situated in Mishima, Japan.

SECONDARY DATABASE
• Derived from the analysis of primary data.
• Present in the form of regular expressions (patterns), fingerprints, and blocks.
• Secondary databases:
– PROSITE
– PRINTS

PROSITE
• Consists of entries describing protein families, domains and functional sites, as well as the amino acid patterns and profiles found in them.
• Complemented by ProRule, a collection of rules based on profiles and patterns.
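PROSITE patterns are written in a compact notation that maps closely onto regular expressions. The sketch below is a simplified translator and not PROSITE's own software: it handles residue sets [..], forbidden residues {..}, the wildcard x, repeat counts (n) or (n,m), and the terminus anchors, and ignores rarer syntax. The test pattern N-{P}-[ST]-{P} is the N-glycosylation site pattern; the sequence is made up.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression (simplified)."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("<", "^").replace(">", "$")   # N-/C-terminus anchors
        element = element.replace("x", ".")                       # x = any residue
        element = element.replace("{", "[^").replace("}", "]")   # {P} = any residue except P
        element = element.replace("(", "{").replace(")", "}")    # x(2,4) = 2 to 4 repeats
        parts.append(element)
    return "".join(parts)


regex = prosite_to_regex("N-{P}-[ST]-{P}.")
print(regex)  # N[^P][ST][^P]

sequence = "MKNGSAANPTQ"  # made-up sequence for illustration
hits = [(m.start(), m.group()) for m in re.finditer(regex, sequence)]
print(hits)  # [(2, 'NGSA')]
```

The second candidate site (NPTQ) is rejected because proline follows the asparagine. Note that re.finditer reports only non-overlapping matches.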

PRINTS
• A collection of protein motif fingerprints.
• The motifs do not overlap but are separated along a sequence, though they may be contiguous in 3D space.
• Fingerprints can encode protein folds and functionalities more flexibly and powerfully than single motifs can; their full diagnostic potency derives from the mutual context provided by motif neighbours.
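The idea of several ordered, non-overlapping motifs that together identify a family can be sketched as follows. This is a deliberate simplification: regular expressions stand in for motifs here, whereas PRINTS itself builds fingerprints from aligned motifs, and the motifs and sequences below are invented. The left-to-right search is only guaranteed to be exhaustive when the motifs have fixed length, as they do in this example.

```python
import re


def match_fingerprint(sequence, motifs):
    """Return the start of each motif if all occur in order without overlap, else None."""
    positions = []
    cursor = 0
    for motif in motifs:
        hit = re.compile(motif).search(sequence, cursor)
        if hit is None:
            return None            # one missing motif means no fingerprint match
        positions.append(hit.start())
        cursor = hit.end()         # the next motif must begin after this one ends
    return positions


# Hypothetical three-motif fingerprint and made-up sequences.
fingerprint = ["GDS[GA]", "HC[ST]", "W.K"]
in_order = match_fingerprint("AAGDSGPLLHCSQQQWAKTT", fingerprint)
out_of_order = match_fingerprint("AAHCSQQGDSGPPWAK", fingerprint)
print(in_order, out_of_order)
```

The second sequence contains all three motifs but in the wrong order, so it is rejected. This is the sense in which neighbouring motifs supply context that a single motif lacks.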

COMPOSITE DATABASE
• Represent an amalgamation of several primary database
sources and are easy to use.
• Access all the relevant information from a single source
rather than connect to multiple resources.
• Examples: NCBI, UniProt, etc.
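A minimal sketch of the amalgamation step, assuming each source database is represented as a mapping from accession to sequence. The source names, accessions, and sequences are invented, and merging identical sequences into one entry is an assumption of this sketch rather than something stated above.

```python
def build_composite(sources):
    """Merge several source databases into one index keyed by sequence.

    `sources` maps a source name to a dict of {accession: sequence}.
    Identical sequences from different sources collapse into one entry
    that lists every (source, accession) pair where the sequence occurs.
    """
    composite = {}
    for source_name, records in sources.items():
        for accession, sequence in records.items():
            composite.setdefault(sequence, []).append((source_name, accession))
    return composite


# Made-up source databases for illustration.
sources = {
    "source_a": {"A1": "MKTAYIAK", "A2": "MSLLT"},
    "source_b": {"B7": "MKTAYIAK"},
}
composite = build_composite(sources)
print(len(composite), composite["MKTAYIAK"])
```

A single lookup in the composite index then answers what would otherwise need one query per source.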

CONCLUSION
• Bioinformatics is the application of information technology to store and organize biological data and make it available in computer-readable form.
• We can easily analyze the vast amount of biological data available in the form of sequences and structures of proteins (the building blocks of organisms) and nucleic acids (the information carriers).
• The need for storing and communicating large datasets has grown.
• Biological databases make these data available to scientists.

REFERENCES
• Books:
– C.S.V. Murthy, Bioinformatics, 1st edition, 2003.
– S.C. Rastogi, Bioinformatics, 1st edition, 2003.
• Other sources:
– https://www.ncbi.nlm.nih.gov/nuccore/NC_002371.2
– http://vle.du.ac.in/mod/book/print.php?id=8913&chapterid=12618
– https://web.expasy.org/docs/swiss-prot_guideline.html
– nd%20Managing%20Information%20Leicester/page_21.htm
– https://bioinf.comav.upv.es/courses/biotech3/theory/databases.html