DATABASES
DATABASE
 Data is the information available on a particular topic or subject.
 A database is a computerized archive used to store and organize data in such a way that information can be
retrieved easily via a variety of search criteria.
 Computerized databases offer many facilities and utilities:
 It is easy to search and obtain required information.
 Redundancy of data can be reduced. This also avoids inconsistencies in the data, since any change to the data
need not be carried out at several places in the database.
 The data can be shared more easily because a database may be accessed by several users simultaneously.
 The data can be authenticated and standards can be enforced more easily.
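The redundancy point above can be illustrated with a minimal sketch. The table names, column names and accessions here are hypothetical and not taken from any real biological database. The organism name is stored once and referenced by key, so a correction is made in one place only:

```python
import sqlite3


def corrected_rows():
    """Store each organism name once, then fix it in a single place."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE organism (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE sequence (
            accession   TEXT PRIMARY KEY,
            organism_id INTEGER NOT NULL REFERENCES organism(id)
        );
        INSERT INTO organism VALUES (1, 'E. coli');
        -- Hypothetical accessions; both point at the same organism row.
        INSERT INTO sequence VALUES ('SEQ0001', 1);
        INSERT INTO sequence VALUES ('SEQ0002', 1);
        """
    )
    # One update corrects the name for every sequence that refers to it.
    conn.execute("UPDATE organism SET name = 'Escherichia coli' WHERE id = 1")
    rows = conn.execute(
        "SELECT s.accession, o.name FROM sequence s "
        "JOIN organism o ON o.id = s.organism_id ORDER BY s.accession"
    ).fetchall()
    conn.close()
    return rows
```

Had the name been repeated in every sequence row, the same correction would have needed one change per row, with the risk of missing some.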
BIOLOGICAL DATABASE
A collection of biological data arranged in computer readable form that enhances the speed of search and retrieval
and is convenient to use is called a biological database. A range of information collected from scientific
experiments, published literature, information regarding biological sequences, structures, binding sites, metabolic
interactions, functional relationships, protein families, motifs (a short conserved region in a DNA sequence or
protein) and homologs (biological molecules related to one another by divergent evolution from a common
ancestor) etc., can be retrieved from these databases. They link knowledge obtained from various fields of biology
and medicine.
 Biological databases are of the following types:
1. Primary database
2. Secondary database
3. Composite database
PRIMARY DATABASES
Primary databases store raw experimental data and
contain only sequence or structure information. The
different types of primary databases are
1. Primary nucleic acid databases
 They hold the experimentally determined nucleotide sequence information, together with the protein
sequence inferred from the conceptual translation of these nucleotide sequences.
 These are sequences submitted directly by scientists and genome sequencing groups, and sequences
taken from literature and patents.
 The three primary nucleotide sequence databases are the EMBL Nucleotide Sequence Database, GenBank
and DDBJ. Together these three form the International Nucleotide Sequence Database Collaboration.
 Database entries are exchanged daily between these three primary nucleotide databases, so the
three function as a virtually unified database called INSD (International Nucleotide Sequence
Database).
 These databases can be used without any legal restrictions.
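The conceptual translation mentioned above (inferring a protein sequence from a nucleotide sequence) can be sketched as follows. This is a minimal illustration that assumes the standard genetic code, reads a single frame from the first base and stops at the first stop codon. The function name is hypothetical, and real database pipelines also handle alternative codon tables, reading frames and ambiguous bases.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def conceptual_translation(nucleotides):
    """Translate one reading frame until the first stop codon."""
    dna = nucleotides.upper().replace("U", "T")  # accept RNA input too
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")  # X = unknown codon
        if amino_acid == "*":  # stop codon ends the translation
            break
        protein.append(amino_acid)
    return "".join(protein)
```

For example, `conceptual_translation("ATGGCCTAA")` reads ATG (M), GCC (A) and then the stop codon TAA.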
a) GenBank
 Is a public database of publicly available nucleotide sequences and their protein translations, with
supporting bibliographic and biological annotation.
 Is built and maintained by NCBI (National Center for Biotechnology Information).
 Besides sequence data, GenBank files contain information such as accession numbers, gene names,
phylogenetic classification and references to published literature.
 Data may be submitted using BankIt (a web-based submission tool), Sequin (NCBI’s stand-alone
submission software) or the Barcode Submission Tool (also web-based).
 Data are retrieved through the Entrez system, a database retrieval system that gives access to the entries.
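As a hedged sketch of programmatic retrieval, the snippet below only builds a request URL for NCBI's E-utilities `efetch` endpoint, the scripted interface to Entrez. It does not contact the network. The function name is hypothetical, `U49845` is used purely as an example accession, and the parameter choices (`nuccore`, `gb`, `text`) are assumptions about what a typical GenBank flat-file request looks like.

```python
from urllib.parse import urlencode

# Base address of the E-utilities efetch service.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def entrez_efetch_url(accession, db="nuccore", rettype="gb"):
    """Build (but do not send) an efetch request for one record."""
    query = urlencode(
        {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    )
    return f"{EFETCH_BASE}?{query}"


url = entrez_efetch_url("U49845")
```

The resulting URL could then be opened with any HTTP client. NCBI's usage guidelines on request rates should be checked before scripting many requests.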
b) EMBL (European Molecular Biology Laboratory)
 Constitutes Europe’s primary nucleotide sequence resource.
 The data originates from a combination of large-scale genome sequencing projects, direct submissions
from individual scientists and the European Patent Office.
 There is a quarterly release of the whole database while new and updated records are distributed daily.
 EMBL database entries are grouped into divisions, mainly on the basis of taxonomy. A few divisions,
such as HTG (High-Throughput Genome Sequences) and GSS (Genome Survey Sequences), are instead
grouped by the specific nature of the underlying data. The divisions thus provide subsets of the
database that reflect the areas of interest of many users. The EMBL database currently consists of 17
divisions, with each entry belonging to exactly one division.
 The database can be accessed, and sequences retrieved, via the EBI SRS (Sequence Retrieval System)
server, the FTP server, or Dbfetch (database fetch), a tool for simple sequence retrieval via HTTP.
c) DDBJ (DNA Data Bank of Japan)
 Is the only nucleotide sequence databank in Asia certified to collect nucleotide sequences from
researchers and to issue the internationally recognized accession number to data submitters.
 It collects sequence data mainly from Japanese researchers.
 The principal purpose of DDBJ operations is to improve the quality of INSD, i.e. when researchers
make their data open to the public through INSD, scientists at DDBJ make efforts to annotate the
data as richly as possible, according to the unified rules of INSD.
 For submitting their data, Japanese genome teams use the Mass Submission Tool (MST).
2. Primary protein sequence databases
They contain entries which describe protein domains, families and
functional sites. They also contain associated patterns and profiles to
identify protein domains and families.
Swiss-Prot, TrEMBL (translated EMBL) and PIR (Protein
Information Resource) are the primary protein databases and are
different from the nucleotide databases. These databases are curated, i.e.,
they are created and maintained by groups of scientists.
Swiss-Prot
 Swiss-Prot tries to provide a high level of annotation (such as the description of the function of a
protein, its domain structure, post-translational modifications, variants, etc.) and a minimum level of
redundancy. It has a high level of integration with other databases.
 A Swiss-Prot entry contains a large number of annotations. Each line begins with a two-letter code, many
of which are self-explanatory, e.g. ID (identification), AC (accession number), DT (date), DE (description),
GN (gene name), CC (comment), etc.
 Swiss-Prot not only presents a fairly comprehensive description of the protein and its functions but also
provides cross-references to the relevant entries in secondary databases such as PROSITE, PRINTS and
Pfam.
 The Swiss-Prot database has some legal restrictions. The entries themselves are copyrighted, but freely
accessible and usable by academic researchers. Commercial companies must pay a license fee to use
Swiss-Prot.
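The two-letter line codes described above make a Swiss-Prot flat-file entry easy to split into fields. The sketch below groups the lines of an entry by their code. The sample entry is entirely made up for illustration (the accession, names and text are placeholders), and the function name is hypothetical. It assumes the usual flat-file layout: a two-letter code, three blanks, then the content, with `//` ending the entry.

```python
from collections import defaultdict

# Made-up entry for illustration only; not a real Swiss-Prot record.
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN           Reviewed;         120 AA.
AC   P00000;
DT   01-JAN-2000, integrated into the database.
DE   RecName: Full=Hypothetical example protein;
GN   Name=EXMPL;
CC   -!- FUNCTION: Placeholder text for illustration only.
//
"""


def group_by_line_code(entry_text):
    """Map each two-letter line code to the list of its line contents."""
    fields = defaultdict(list)
    for line in entry_text.splitlines():
        if line.startswith("//"):  # end-of-entry marker
            break
        code, content = line[:2], line[5:].strip()
        if code.strip():  # skip blank lines
            fields[code].append(content)
    return dict(fields)


fields = group_by_line_code(SAMPLE_ENTRY)
```

Here `fields["AC"]` holds the accession line and `fields["GN"]` the gene name line. Codes that repeat, such as CC, accumulate in a list in their original order.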
TrEMBL
 TrEMBL is a computer-annotated supplement of Swiss-Prot and contains all the
translations of the EMBL sequence entries that are not yet integrated in Swiss-Prot.
The annotation of an entry in TrEMBL has not reached the standards required for
inclusion into Swiss-Prot. As further data confirm the reliability of the annotations,
TrEMBL entries are moved to Swiss-Prot.
 Swiss-Prot and TrEMBL are developed by the Swiss-Prot groups at the Swiss
Institute of Bioinformatics (SIB) and at the European Bioinformatics Institute (EBI).
PIR
 PIR is a database of functionally annotated protein sequences. It aims to be comprehensive, well
organised, accurate and consistently annotated, but its entry annotation is not as complete as that of
Swiss-Prot.
 It is a division of the NBRF (National Biomedical Research Foundation) in the US.
 It has collaborated with EBI and SIB to establish UniProt (Universal Protein Resource), which provides a
single, centralised, authoritative resource for protein sequences and functional information.
 PIR also produces NRL-3D, a database of sequences extracted from the 3D structures in the PDB.
NRL-3D makes the sequence information in the PDB available for similarity searches and retrieval, and
provides cross-reference information for use with the other PIR protein sequence databases.
 Swiss-Prot and PIR overlap extensively, but there are still many sequences that can be found in only
one of them.
3. Primary structure database
 These databases pertain to macromolecular structure and store data on
protein and nucleic acid structure. The primary resource for
protein structure data is the Protein Data Bank (PDB). It is
the worldwide archive of structural data maintained by the
Research Collaboratory for Structural Bioinformatics
(RCSB) at Rutgers University. The associated Nucleic Acid
Database (NDB) is also maintained there.
 It is the main primary database for 3D structures of biological macromolecules.
 Data from X-ray crystallography and NMR spectroscopic studies are deposited in the PDB
(using a web-based interface called AutoDep Input Tool). The data are extensively checked and
verified by human curators before acceptance.
 It also accepts experimental data used to determine the structures and homology models.
 PDB entries contain atomic coordinates, and some structural parameters connected with atoms.
 PDB entries are annotated, but not as comprehensively as Swiss-Prot entries.
 There are no legal restrictions on the use of PDB.
 It was established in 1971 at Brookhaven National Laboratory, New York, US. It is now maintained by
RCSB (Research Collaboratory for Structural Bioinformatics).
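To make the "atomic coordinates" point concrete, here is a minimal sketch that reads one ATOM record in the legacy fixed-column PDB file format, where x, y and z occupy columns 31-38, 39-46 and 47-54. The sample line is assembled piece by piece so the column positions are visible. Its values are illustrative and not taken from a specific entry, and the function name is hypothetical.

```python
# Illustrative ATOM record, built field by field (1-based columns noted).
SAMPLE_LINE = (
    "ATOM  "      # 1-6   record name
    "    1"       # 7-11  atom serial number
    " "           # 12    blank
    " N  "        # 13-16 atom name
    " "           # 17    alternate location indicator
    "MET"         # 18-20 residue name
    " "           # 21    blank
    "A"           # 22    chain identifier
    "   1"        # 23-26 residue sequence number
    "    "        # 27-30 insertion code and blanks
    "  27.340"    # 31-38 x coordinate (angstroms)
    "  24.430"    # 39-46 y coordinate
    "   2.614"    # 47-54 z coordinate
)


def parse_atom_record(line):
    """Extract identity and coordinates from one ATOM/HETATM line."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "serial": int(line[6:11]),
        "atom_name": line[12:16].strip(),
        "residue": line[17:20].strip(),
        "chain": line[21],
        "residue_number": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
    }


atom = parse_atom_record(SAMPLE_LINE)
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields in this format may touch without a separating blank.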
 Secondary databases
These databases hold information derived from the data in the primary databases. They
consolidate, summarise, standardise, classify, index and comment on primary databases, and
are very important for inferring protein function. Examples are PROSITE, PRINTS, BLOCKS,
etc.
 Composite databases
These databases amalgamate the information held in two or more primary databases. This means that
only one database needs to be searched, rather than running multiple searches on individual primary
databases.
E.g. OWL (Swiss-Prot, PIR, GenPept and NRL-3D) and
NRDB (Swiss-Prot and TrEMBL).
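The idea behind a composite database can be sketched as a merge that keeps each distinct sequence once while recording which source databases it came from. This is only an illustration under simple assumptions: exact string identity is used to detect redundancy, and the database names, entry identifiers and function name are hypothetical. It does not describe how OWL or NRDB were actually built.

```python
def merge_non_redundant(sources):
    """Merge {db_name: {entry_id: sequence}} into {sequence: [(db, id), ...]}."""
    merged = {}
    for db_name, entries in sources.items():
        for entry_id, sequence in entries.items():
            key = sequence.upper()  # treat case differences as the same sequence
            merged.setdefault(key, []).append((db_name, entry_id))
    return merged


# Hypothetical source databases with one sequence in common.
sources = {
    "db_a": {"A1": "MKTAYIAK", "A2": "MSLLT"},
    "db_b": {"B7": "mktayiak", "B9": "MPPQR"},
}
composite = merge_non_redundant(sources)
```

A single search over `composite` now covers both sources, and the stored `(database, entry)` pairs point back to the original records.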
 Organism specific databases
Contain information, links and resources dedicated to particular species. They contain information
on sequence data, gene expression, mutant phenotypes, genome maps, genome sequencing projects
and relevant scientific literature and provide links to resources for obtaining clones, mutants as well
as for contacting researchers.
E.g. EcoGene (database for E. coli), Mouse Genome Database (MGD) for mouse, OMIM (Online
Mendelian Inheritance in Man) for human.
 Specialised sequence databases
These databases have particular types of nucleic acid or protein sequences deposited in them. For
example, there are databases specifically for rRNA and tRNA sequences.
 Commercial databases
Unlike public databases which can be accessed freely by anyone using the WWW, commercial databases
require subscription as they are the result of a single company’s research and investment.
E.g. Incyte.
 Literature databases
A literature database contains the abstracts and in some cases, the full text and figures of published articles.
Such databases can be searched using text strings to find words in the title, abstract, keywords, or by author
or author’s institution. Medline was one of the earliest comprehensive online library resources. It has now
been incorporated into a large resource called PubMed maintained by the NCBI. Other examples are the
Web of Science and BioMedNet.
Thank you…

EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
 

DATABASES.pptx

  • 2. DATABASE
      - Information available on and related to a particular topic or subject is called data.
      - A database is a computerized archive used to store and organize data so that information can be retrieved easily using a variety of search criteria.
      - Computerized databases offer many facilities and utilities:
          - It is easy to search for and obtain the required information.
          - Redundancy of data can be reduced. This also avoids inconsistencies in the data, since a change to the data need not be carried out at several places in the database.
          - The data can be shared more easily because a database may be accessed by several users simultaneously.
          - The data can be authenticated and standards can be enforced more easily.
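The points about search criteria and reduced redundancy can be illustrated with a small sketch using Python's built-in `sqlite3` module. The schema, table names and rows here are hypothetical and invented only for this illustration. The organism name is stored once, so a single update is reflected in every gene record that refers to it.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE organism (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute(
    "CREATE TABLE gene (symbol TEXT, length_bp INTEGER, "
    "organism_id INTEGER REFERENCES organism(id))"
)
conn.execute("INSERT INTO organism VALUES (1, 'Escherichia coli')")
conn.executemany(
    "INSERT INTO gene VALUES (?, ?, ?)",
    [("geneA", 900, 1), ("geneB", 1500, 1)],
)

# The organism name lives in one place, so one update is seen by every gene row.
conn.execute("UPDATE organism SET name = 'E. coli K-12' WHERE id = 1")


def find_genes(min_length):
    """Return (symbol, organism name) for genes at least min_length bp long."""
    rows = conn.execute(
        "SELECT g.symbol, o.name FROM gene g "
        "JOIN organism o ON o.id = g.organism_id "
        "WHERE g.length_bp >= ? ORDER BY g.symbol",
        (min_length,),
    )
    return rows.fetchall()
```

Calling `find_genes(1000)` searches by one criterion (minimum length); other criteria could be added as further `WHERE` conditions.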
  • 3. BIOLOGICAL DATABASE
      - A biological database is a collection of biological data arranged in computer-readable form that speeds up search and retrieval and is convenient to use.
      - A range of information can be retrieved from these databases: data collected from scientific experiments and published literature, and information on biological sequences, structures, binding sites, metabolic interactions, functional relationships, protein families, motifs (short conserved regions in a DNA or protein sequence) and homologs (biological molecules related to one another by divergent evolution from a common ancestor).
      - They link knowledge obtained from various fields of biology and medicine.
      - Biological databases are of the following types:
          1. Primary database
          2. Secondary database
          3. Composite database
  • 4. PRIMARY DATABASES
      - Primary databases store raw experimental data and contain only sequence or structure information.
      - The different types of primary databases are described on the following slides.
  • 5. 1. Primary nucleic acid databases
      - They hold experimentally determined nucleotide sequence information, together with the protein sequences inferred by conceptual translation of these nucleotide sequences.
      - The sequences are submitted directly by scientists and genome sequencing groups, or are taken from the literature and patents.
      - The three primary nucleotide sequence databases are the Nucleotide Sequence Database maintained by EMBL, GenBank and DDBJ. Together these three make up the International Nucleotide Sequence Database Collaboration.
      - Database entries are exchanged daily between these three primary nucleotide databases, so the three function as a virtually unified database called INSD (International Nucleotide Sequence Database).
      - These databases can be used without any legal restrictions.
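Conceptual translation, mentioned above, means reading a nucleotide sequence three bases at a time and looking each codon up in the genetic code. A minimal sketch, assuming the standard genetic code, the forward strand and reading frame 1 (the function name `translate` is chosen only for this illustration; real annotation pipelines also handle alternative codes, frames and ambiguity symbols):

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        # Codons containing unknown symbols (e.g. N) are reported as X.
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

For example, `translate("ATGGCCTAA")` reads the codons ATG, GCC and TAA and returns `"MA"`, since TAA is a stop codon.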
  • 6. a) GenBank
      - It is a public database of nucleotide sequences and their protein translations, with supporting bibliographic and biological annotation.
      - It is built and maintained by NCBI.
      - Besides sequence data, GenBank files contain information such as accession numbers, gene names, phylogenetic classification and references to published literature.
      - Data may be submitted using BankIt (a web-based submission tool), Sequin (NCBI's stand-alone submission software) or the Barcode Submission Tool (a web-based submission tool).
      - Data are retrieved through the Entrez system, a database retrieval system that gives access to the database entries.
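Entrez can also be queried programmatically through NCBI's E-utilities. The sketch below only builds a request URL for the `efetch` utility and does not contact the server. The helper name `genbank_efetch_url` is hypothetical; the accession U49845 is the record NCBI uses as its GenBank sample entry.

```python
from urllib.parse import urlencode

# Base address of the E-utilities efetch service.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_efetch_url(accession, rettype="gb"):
    """Build an efetch URL for one nucleotide record (hypothetical helper)."""
    params = {
        "db": "nuccore",      # Entrez nucleotide database
        "id": accession,      # accession number of the entry
        "rettype": rettype,   # "gb" for GenBank flat file, "fasta" for FASTA
        "retmode": "text",
    }
    return EFETCH_BASE + "?" + urlencode(params)


url = genbank_efetch_url("U49845")
```

Opening the resulting URL in a browser, or passing it to `urllib.request.urlopen`, should return the entry as plain text, subject to NCBI's usage guidelines.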
  • 7. b) EMBL (European Molecular Biology Laboratory)
      - It constitutes Europe's primary nucleotide sequence resource.
      - The data originate from a combination of large-scale genome sequencing projects, direct submissions from individual scientists and the European Patent Office.
      - The whole database is released quarterly, while new and updated records are distributed daily.
      - EMBL database entries are grouped into divisions based mainly on taxonomy, with a few exceptions such as the HTG (High-Throughput Genome Sequences) and GSS (Genome Survey Sequences) divisions, for which grouping is based on the specific nature of the underlying data. The divisions thus provide subsets of the database that reflect the areas of interest of many users. The EMBL database currently consists of 17 divisions, with each entry belonging to exactly one division.
      - The database can be accessed and sequences retrieved via the EBI SRS (Sequence Retrieval System) server, the FTP server, or Dbfetch (database fetch), a tool for simple sequence retrieval via HTTP.
  • 8. c) DDBJ (DNA Data Bank of Japan)
      - It is the only nucleotide sequence databank in Asia certified to collect nucleotide sequences from researchers and to issue internationally recognized accession numbers to data submitters.
      - It collects sequence data mainly from Japanese researchers.
      - The principal purpose of DDBJ operations is to improve the quality of INSD: when researchers make their data public through INSD, scientists at DDBJ annotate the data as richly as possible, following the unified rules of INSD.
      - Japanese genome teams submit their data using the Mass Submission Tool (MST).
  • 9. 2. PRIMARY PROTEIN SEQUENCE DATABASES
      - They contain entries which describe protein domains, families and functional sites. They also contain associated patterns and profiles used to identify protein domains and families.
      - Swiss-Prot, TrEMBL (translated EMBL) and PIR (Protein Information Resource) are the primary protein databases and are distinct from the nucleotide databases.
      - These databases are curated, i.e., they are created and maintained by groups of scientists.
  • 10. Swiss-Prot
      - Swiss-Prot aims to provide a high level of annotation (such as a description of the function of a protein, its domain structure, post-translational modifications and variants) and a minimum level of redundancy. It has a high level of integration with other databases.
      - A Swiss-Prot entry contains a large number of annotations. Each line begins with a two-letter code, many of which are self-explanatory, e.g. ID (identification), AC (accession number), DT (date), DE (description), GN (gene name), CC (comment).
      - Swiss-Prot not only presents a fairly comprehensive description of the protein and its functions but also provides cross-references to the relevant entries in secondary databases such as PROSITE, PRINTS and Pfam.
      - The Swiss-Prot database has some legal restrictions. The entries themselves are copyrighted, but freely accessible and usable by academic researchers. Commercial companies must pay a license fee to use Swiss-Prot.
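Because every annotation line starts with a two-letter code, an entry can be split into fields with very little code. The sketch below assumes the usual flat-file layout (code in the first two columns, content from the sixth column, `//` ending the entry). The sample entry is invented for illustration and is not a real Swiss-Prot record; `group_by_line_code` is a hypothetical helper name.

```python
from collections import defaultdict

# Invented entry, used only to illustrate the line-code layout.
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN           Reviewed;         120 AA.
AC   X00000;
DE   RecName: Full=Hypothetical example protein;
GN   Name=EXMP1;
CC   -!- FUNCTION: Invented comment used only for illustration.
//
"""


def group_by_line_code(entry_text):
    """Group the lines of one flat-file entry by their two-letter line code."""
    fields = defaultdict(list)
    for line in entry_text.splitlines():
        if line.startswith("//"):      # end-of-entry marker
            break
        code, content = line[:2], line[5:].strip()
        if code.strip():               # skip lines without a code (e.g. sequence data)
            fields[code].append(content)
    return dict(fields)


fields = group_by_line_code(SAMPLE_ENTRY)
```

Here `fields["AC"]` holds the accession line and `fields["GN"]` the gene name line; a production parser would also handle continuation lines and multiple entries per file.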
  • 11. TrEMBL
      - TrEMBL is a computer-annotated supplement to Swiss-Prot and contains the translations of all EMBL sequence entries that are not yet integrated into Swiss-Prot. The annotation of an entry in TrEMBL has not reached the standard required for inclusion in Swiss-Prot. As further data confirm the reliability of the annotations, TrEMBL entries are moved to Swiss-Prot.
      - Swiss-Prot and TrEMBL are developed by the Swiss-Prot groups at the Swiss Institute of Bioinformatics (SIB) and at the European Bioinformatics Institute (EBI).
  • 12. PIR
      - PIR is a database of functionally annotated protein sequences. It aims to be comprehensive, well organised, accurate and consistently annotated, but its entries are not annotated as completely as those in Swiss-Prot.
      - It is a division of the NBRF (National Biomedical Research Foundation) in the US.
      - It has collaborated with EBI and SIB to establish UniProt (the universal protein database), which provides a single, centralised, authoritative resource for protein sequences and functional information.
      - PIR also produces NRL-3D, a database of sequences extracted from the 3D structures in the PDB. NRL-3D makes the sequence information in the PDB available for similarity searches and retrieval, and provides cross-reference information for use with the other PIR protein sequence databases.
      - Swiss-Prot and PIR overlap extensively, but there are still many sequences that can be found in only one of them.
  • 13. 3. PRIMARY STRUCTURE DATABASE
      - These databases pertain to macromolecular structure and store data on protein and nucleic acid structures.
      - The primary resource for protein structure data is the Protein Data Bank (PDB). It is the worldwide archive of structural data, maintained by the Research Collaboratory for Structural Bioinformatics (RCSB) at Rutgers University.
      - The associated Nucleic Acid Database (NDB) is also maintained there.
  • 14.
      - The PDB is the main primary database for 3D structures of biological macromolecules.
      - Data from X-ray crystallography and NMR spectroscopy studies are deposited in the PDB using a web-based interface called the AutoDep Input Tool. The data are extensively checked and verified by human curators before acceptance.
      - It also accepts the experimental data used to determine the structures, and homology models.
      - PDB entries contain atomic coordinates and some structural parameters connected with the atoms.
      - PDB entries are annotated, but not as comprehensively as Swiss-Prot entries.
      - There are no legal restrictions on the use of the PDB.
      - It was established in 1971 at the Brookhaven National Laboratory, New York, US. It is now maintained by the RCSB (Research Collaboratory for Structural Bioinformatics).
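In the legacy PDB text format, atomic coordinates are stored in fixed-width ATOM records, so fields are read by column position rather than by splitting on whitespace. The following is a minimal sketch covering only a few fields. The sample record is invented and is assembled column by column so that the layout is explicit; `parse_atom_record` is a hypothetical helper name.

```python
# Invented ATOM record, assembled field by field to show the column layout.
SAMPLE_ATOM_LINE = (
    "ATOM".ljust(6)        # cols 1-6   record name
    + "1".rjust(5)         # cols 7-11  atom serial number
    + " "                  # col 12     blank
    + " N".ljust(4)        # cols 13-16 atom name
    + " "                  # col 17     alternate location indicator
    + "MET"                # cols 18-20 residue name
    + " "                  # col 21     blank
    + "A"                  # col 22     chain identifier
    + "1".rjust(4)         # cols 23-26 residue sequence number
    + " " * 4              # col 27 insertion code, cols 28-30 blank
    + "27.340".rjust(8)    # cols 31-38 x coordinate (angstroms)
    + "24.430".rjust(8)    # cols 39-46 y coordinate
    + "2.614".rjust(8)     # cols 47-54 z coordinate
)


def parse_atom_record(line):
    """Extract a few fixed-column fields from one ATOM record."""
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    return {
        "serial": int(line[6:11]),
        "atom_name": line[12:16].strip(),
        "residue": line[17:20].strip(),
        "chain": line[21],
        "residue_number": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


atom = parse_atom_record(SAMPLE_ATOM_LINE)
```

Full records carry further columns (occupancy, temperature factor, element symbol) that this sketch ignores; established libraries are a better choice for real structure files.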
  • 15.
      - Secondary databases contain information derived from the data in the primary databases. They consolidate, summarise, standardise, classify, index and comment on primary databases, and are very important for inferring protein function. Examples are PROSITE, PRINTS and BLOCKS.
      - Composite databases amalgamate the information held in two or more primary databases. This means that only one database needs to be searched, rather than running multiple searches on individual primary databases. Examples:
          - OWL: Swiss-Prot, PIR, GenPept and NRL-3D
          - NRDB: Swiss-Prot and TrEMBL
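The idea behind a composite database can be sketched as a merge that keeps each distinct sequence once and records where else it occurs. This is only an illustration under simple assumptions (exact sequence identity, source order used as priority), not the procedure used by OWL or NRDB. The database names and entries below are invented.

```python
def build_composite(sources):
    """Merge entries from several databases, keeping each sequence only once.

    sources is a list of (db_name, {entry_id: sequence}) in priority order.
    """
    composite = {}
    for db_name, entries in sources:
        for entry_id, sequence in entries.items():
            key = sequence.upper()
            if key not in composite:
                # First occurrence becomes the representative record.
                composite[key] = {
                    "sequence": key,
                    "primary": (db_name, entry_id),
                    "also_in": [],
                }
            else:
                # Later identical sequences are kept only as cross-references.
                composite[key]["also_in"].append((db_name, entry_id))
    return list(composite.values())


# Invented source databases and entries, for illustration only.
sources = [
    ("dbA", {"A1": "MKTAYIAK", "A2": "MLSRAV"}),
    ("dbB", {"B1": "mktayiak", "B2": "MGGHLV"}),
]
merged = build_composite(sources)
```

With these inputs, `merged` has three records instead of four, and a single search over it covers both sources.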
  • 16.
      - Organism-specific databases contain information, links and resources dedicated to particular species. They cover sequence data, gene expression, mutant phenotypes, genome maps, genome sequencing projects and relevant scientific literature, and provide links to resources for obtaining clones and mutants as well as for contacting researchers. Examples are EcoGene for E. coli, the Mouse Genome Database (MGD) for mouse, and OMIM (Online Mendelian Inheritance in Man) for human.
      - Specialised sequence databases hold particular types of nucleic acid or protein sequences. For example, there are databases specifically for rRNA and tRNA sequences.
  • 17.
      - Commercial databases: unlike public databases, which can be accessed freely by anyone over the web, commercial databases require a subscription, as they are the result of a single company's research and investment. An example is Incyte.
      - Literature databases: a literature database contains the abstracts and, in some cases, the full text and figures of published articles. Such databases can be searched using text strings to find words in the title, abstract or keywords, or by author or author's institution. Medline was one of the earliest comprehensive online library resources. It has now been incorporated into a larger resource called PubMed, maintained by the NCBI. Other examples are the Web of Science and BioMedNet.