BIOINFORMATICS AND DATABASES IN BIOINFORMATICS.pdf

Dr. Harisingh Gour Viswavidyalaya
A Central University
DEPARTMENT OF ZOOLOGY
TOPIC – DATABASES IN BIOINFORMATICS
MID II ASSIGNMENT
ZOO – SEC – 128
SUBMITTED TO – MR. ANUPAM KUMAR
SUBMITTED BY –
PRAVANJAN DASH
ROLL NO. – Y23265020, M.Sc. 1st YEAR, 1st SEMESTER

INTRODUCTION TO DATABASES
BIOLOGICAL DATABASES are
• Collections of files containing records of biological data in
machine-readable form that can be accessed, added to, retrieved,
manipulated and modified.
• Used to store, manage, connect and distribute data.
• Arranged by sets of rules that are programmed into the software
that manages the data, called a Database Management System (DBMS).
• Collections of data that are structured, searchable, periodically
updated and cross-referenced.
• Stored, maintained, annotated and curated for public/research use.
• Collected and organized in a specific but useful way.
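The operations listed above (add, retrieve, modify) can be sketched with Python's built-in sqlite3 module standing in for the DBMS. This is only an illustration: the table name, column names and the sample record are hypothetical and do not correspond to any real biological database.

```python
import sqlite3

# In-memory database; SQLite plays the role of the DBMS here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records ("
    " record_id TEXT PRIMARY KEY,"   # hypothetical identifier column
    " organism TEXT NOT NULL,"
    " sequence TEXT NOT NULL)"
)

# Add a record (illustrative values, not a real database entry).
conn.execute(
    "INSERT INTO records VALUES (?, ?, ?)",
    ("REC0001", "Example organism", "ATGGCCATTG"),
)

# Modify the record.
conn.execute(
    "UPDATE records SET organism = ? WHERE record_id = ?",
    ("Example organism strain A", "REC0001"),
)

# Retrieve the record by its identifier.
row = conn.execute(
    "SELECT record_id, organism, sequence FROM records WHERE record_id = ?",
    ("REC0001",),
).fetchone()
print(row)
```

The PRIMARY KEY and NOT NULL constraints are a small example of the "sets of rules" that the DBMS enforces on the data.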

Classification based on type of data stored
• Primary Databases: Contain original data in the form of
primary sequence data or structural data as submitted by the
scientific community.
• Secondary Databases: Contain information that has been
processed and derived from the raw data available in primary
databases, e.g. PROSITE, PRINTS, BLOCKS.
• Composite Databases: Collect data from different primary databases,
compare and filter them, and present only the non-redundant sequences.
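
The filtering step of a composite database can be sketched as follows. This is a minimal illustration, assuming that "redundant" means an exact, case-insensitive sequence match; real composite databases apply their own, more elaborate criteria. The function name merge_non_redundant and all records are hypothetical.

```python
def merge_non_redundant(*sources):
    """Merge records from several sources, keeping one entry per distinct sequence."""
    seen = {}  # normalised sequence -> identifier of its first occurrence
    for source in sources:
        for record_id, sequence in source:
            key = sequence.upper()
            if key not in seen:       # the first occurrence wins
                seen[key] = record_id
    return [(record_id, sequence) for sequence, record_id in seen.items()]


# Hypothetical records from two primary sources; B1 repeats the sequence of A1.
db_a = [("A1", "MKTAYIAK"), ("A2", "MSLLTEVE")]
db_b = [("B1", "mktayiak"), ("B2", "MAHHHHHH")]

composite = merge_non_redundant(db_a, db_b)
print(composite)
```

Only three of the four input records survive, because B1 duplicates the sequence already contributed by A1.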

PRIMARY DATA VERSUS SECONDARY DATA
PRIMARY DATA
• Primary data is data that researchers collect directly
from the original sources.
• Includes real-time data.
• Collected to address a current research
problem.
• Obtaining primary data is a relatively
long process.
• Data collection tools include observations,
surveys, questionnaires (paper or online),
physical testing, personal or telephone
interviews, case studies, and focus group
discussions.
SECONDARY DATA
• Secondary data is existing data
produced by previous researchers.
• Related to the past.
• Originally collected to address earlier
research problems, but can be used
to address the current research problem as
well.
• Referring to secondary data is quick and easy.
• Sources include journal articles,
websites, books, government publications,
records, etc.

PRIMARY DATABASES
• Primary databases contain original biological data. They are
archives of raw sequence or structural data submitted by the scientific
community.
• Once an entry has been given a database accession number, it is kept
as an archival record: the accession number stays fixed, and the data
are generally revised only by the original submitters.
• There are three major public sequence databases (GenBank, EMBL,
DDBJ) that store raw nucleic acid sequence data produced and
submitted by researchers worldwide.
• SOME PRIMARY DATABASES
Nucleic acid databases: GenBank, EMBL, DDBJ
Protein sequence databases: PIR, Swiss-Prot, UniProt
Protein structure database: PDB
Metabolic databases: KEGG

SECONDARY DATABASES
• Secondary databases contain additional information
derived from the analysis of data available in primary
sources. The primary data are analysed in a variety
of ways, so secondary databases contain different
information in different formats.
• SOME SECONDARY DATABASES ARE
✓ TrEMBL
✓ Pfam
✓ PROSITE
✓ Profiles
✓ SCOP
✓ CATH
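
PROSITE, listed above, stores sequence patterns derived from primary sequence data. As a rough illustration of how such derived information can be used, the sketch below translates a PROSITE-style pattern into a Python regular expression and searches a sequence with it. It handles only the basic pattern elements (x, [..], {..}, repeat counts and the < > anchors) and is a simplification, not PROSITE's own software. The function name prosite_to_regex and the test sequence are hypothetical.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    regex = pattern.rstrip(".")                            # drop the trailing period
    regex = regex.replace("-", "")                         # '-' only separates elements
    regex = regex.replace("{", "[^").replace("}", "]")    # {P}    -> any residue except P
    regex = regex.replace("(", "{").replace(")", "}")      # x(2,4) -> repeat 2 to 4 times
    regex = regex.replace("x", ".")                        # x      -> any residue
    regex = regex.replace("<", "^").replace(">", "$")      # sequence start / end anchors
    return regex


# N-glycosylation site pattern: N, then not P, then S or T, then not P.
regex = prosite_to_regex("N-{P}-[ST]-{P}.")
match = re.search(regex, "AANGSAANPSA")    # made-up sequence
print(regex, match.group(), match.start())
```

The second N in the made-up sequence is followed by P, so only the first site matches.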

NUCLEOTIDE SEQUENCE DATABASES
• Composed of a group of nucleotide sequence entries.
• Data repositories that accept nucleic acid sequence data
and make it freely available to the public.
• GenBank, EMBL and DDBJ are the principal nucleotide
databases.
• All three are members of the International Nucleotide
Sequence Database Collaboration (INSDC) and exchange
data with one another.
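
A repository that accepts nucleic acid sequence data needs some way to tell a nucleotide sequence from other text. The sketch below shows one simple check, assuming the IUPAC nucleotide codes as the allowed alphabet. It is an illustration only and is not the validation that GenBank, EMBL or DDBJ actually perform; the function name is hypothetical.

```python
# IUPAC nucleotide codes: the four bases, U for RNA, and the ambiguity symbols.
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def is_nucleotide_sequence(sequence):
    """Return True if every symbol is an IUPAC nucleotide code (case-insensitive)."""
    cleaned = "".join(sequence.split()).upper()   # ignore whitespace and line breaks
    return bool(cleaned) and set(cleaned) <= IUPAC_NUCLEOTIDES


print(is_nucleotide_sequence("atgc nnry"))   # nucleotide symbols only
print(is_nucleotide_sequence("MKTAYIAK"))    # contains I, which is not a nucleotide code
```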

PROTEIN SEQUENCE DATABASES
• A collection of amino acid sequence entries arranged
according to identification number.
• Well-known protein sequence databases available
on the web are
✓ Swiss-Prot
✓ PIR
✓ UniProt
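
Arranging entries by identification number and looking one up can be sketched as below. The example assumes the entries arrive as FASTA-formatted text, a common plain-text sequence format; that choice is an assumption made for illustration, and the identifiers and sequences are made up.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping identifier -> sequence."""
    entries = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current_id = line[1:].split()[0]    # identifier = first word after '>'
            entries[current_id] = []
        elif current_id is not None:
            entries[current_id].append(line)    # a sequence may span several lines
    return {entry_id: "".join(parts) for entry_id, parts in entries.items()}


fasta_text = """>ID002 hypothetical protein two
MSLLTEVE
TPIRNEWG
>ID001 hypothetical protein one
MKTAYIAK
"""

entries = parse_fasta(fasta_text)
for entry_id in sorted(entries):                # arrange by identification number
    print(entry_id, entries[entry_id])
```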

PROTEIN STRUCTURE DATABASE
• Many proteins that share a common evolutionary
origin show structural similarities.
• Dissimilar proteins differ in their primary, secondary,
tertiary and quaternary structures.
• Whether protein structures are similar or dissimilar can be
predicted with the help of a structure database.
• These databases store a collection of three-dimensional
structures of proteins.
• An example is the Protein Data Bank (PDB).
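
A three-dimensional structure is stored as atomic coordinates. The sketch below reads a few fields from one ATOM record of the fixed-width PDB file format. It is a minimal illustration that covers only some columns; the function name parse_atom_line is hypothetical and the sample record is made up rather than taken from a real entry.

```python
def parse_atom_line(line):
    """Extract a few fields from one fixed-width PDB-format ATOM record."""
    return {
        "serial": int(line[6:11]),            # columns 7-11
        "atom_name": line[12:16].strip(),     # columns 13-16
        "residue": line[17:20].strip(),       # columns 18-20
        "chain": line[21],                    # column 22
        "residue_number": int(line[22:26]),   # columns 23-26
        "x": float(line[30:38]),              # columns 31-38
        "y": float(line[38:46]),              # columns 39-46
        "z": float(line[46:54]),              # columns 47-54
    }


# Made-up ATOM record, assembled field by field so the column widths stay visible.
sample = (
    "ATOM  "      # columns 1-6: record name
    "    1"       # columns 7-11: atom serial number
    " "           # column 12: blank
    " N  "        # columns 13-16: atom name
    " "           # column 17: alternate location indicator
    "MET"         # columns 18-20: residue name
    " "           # column 21: blank
    "A"           # column 22: chain identifier
    "   1"        # columns 23-26: residue sequence number
    "    "        # columns 27-30: insertion code and blanks
    "  27.340"    # columns 31-38: x coordinate
    "  24.430"    # columns 39-46: y coordinate
    "   2.614"    # columns 47-54: z coordinate
)

atom = parse_atom_line(sample)
print(atom)
```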
