S.BITUILA
II MSC.
Sequence and structural databases of DNA
and protein, and their significance in scientific
research.
DNA Databases:
• Sequence Databases
• Structural Databases
DNA Sequence Databases:
• NCBI
• EMBL
• DDBJ
• Ensembl
• GenBank
• EBI
• UniGene
NCBI (National Center for Biotechnology Information)
• Established in 1988.
• Aims to create public databases, develop software tools for sequence analysis and disseminate biomedical information, mainly to aid research in computational biology.
• Roles:
- Maintains several biological databases, e.g. GenBank, the nucleic acid sequence database.
- Provides a data retrieval system (e.g. Entrez).
- Provides computational resources for the analysis of GenBank data and a variety of other biological databases.
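Entrez can also be queried programmatically through the NCBI E-utilities. The sketch below only builds an EFetch request URL and does not send it; the helper name `efetch_url` and the example accession are assumptions made for illustration, not part of the original slides.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities, the programmatic interface to Entrez.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(db: str, accession: str, rettype: str = "fasta") -> str:
    """Build (but do not send) an EFetch request URL for a single record."""
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


# Example accession chosen for illustration; substitute any GenBank identifier.
url = efetch_url("nuccore", "NM_000546")
print(url)
```

Keeping URL construction separate from the network call makes the request easy to inspect before it is sent with any HTTP client.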
Tools available in NCBI:
• Entrez and the BLAST family: standard BLAST, megaBLAST, PSI-BLAST, RPS-BLAST
• Types of databases:
- Nucleotide databases
- Literature databases
- Protein databases
- Gene expression databases
- Structural databases
- Chemical and others.
EMBL (European Molecular Biology Laboratory)
• Established in 1974 on the initiative of Leo Szilard, James Watson and John Kendrew.
• Roles:
- Incorporates, organizes and distributes nucleotide sequences from public sources.
- Performs basic research in molecular biology and medicine, and trains scientists, students and visitors.
• Tools:
- PPSearch, GeneQuiz, FASTA, DALI, BLAST2, Radar, DaliLite etc.
DDBJ (DNA Data Bank of Japan)
• Established in 1986.
• Roles:
- Collects nucleotide sequence data and makes it freely available.
- Provides a supercomputer system to support research activities in the life sciences.
• Tools:
- getentry, SRS, TXSearch, LIBRA, GIB.
Ensembl:
• Launched in 1999 in response to the imminent completion of the Human Genome Project.
• Joint project between the European Bioinformatics Institute and the Wellcome Trust Sanger Institute.
• Aims to provide a centralized resource for geneticists, molecular biologists and other researchers studying the genomes of our own species, other vertebrates and model organisms.
• Hosts genome databases for vertebrates and other eukaryotic species.
• One of the best-known genome browsers for the retrieval of genomic information.
• Plays a major role in the ENCODE (Encyclopedia of DNA Elements) project.
• Tools: BLAST, Data Slicer, Variant Effect Predictor, Assembly Converter etc.
GenBank:
• Started in 1982 by Walter Goad at Los Alamos National Laboratory.
• Produced and maintained by the National Center for Biotechnology Information (NCBI) as part of the International Nucleotide Sequence Database Collaboration (INSDC).
• Roles:
- Open-access, annotated collection of all publicly available nucleotide sequences and their protein translations.
- Provides and encourages access within the scientific community.
• Tools: BarSTool, Sequin, BLAST.
EBI (European Bioinformatics Institute):
• Grew out of the EMBL Nucleotide Sequence Data Library, set up in 1980; EMBL-EBI itself was established in 1994.
• EMBL-EBI is a centre for research and services in bioinformatics, and is part of the European Molecular Biology Laboratory (EMBL).
• Hosts a number of publicly open, free-to-use life science resources, including biomedical databases, analysis tools and bio-ontologies, such as:
- ArrayExpress: an archive of gene expression experiments.
- BioModels: a database of computational models relevant to the life sciences.
- BioStudies: a database that serves as a generic data archive at EMBL-EBI for biomolecular datasets.
- European Nucleotide Archive (ENA): a resource for nucleotide sequencing information.
UniGene:
• An NCBI database of the transcriptome and thus, despite the name, not primarily a database of genes.
• Provides information on protein similarities, gene expression, cDNA clones and genomic location.
DNA Structural Databases:
RNase P Database:
• Compilation of RNase P sequences, sequence alignments, secondary structures, three-dimensional models and accessory information.
• Also contains secondary structures of bacterial and archaeal RNase P RNAs, including specially annotated 'reference' secondary structures of the E. coli and Bacillus subtilis RNase P RNAs, a minimum phylogenetic consensus structure, and coordinates for models of three-dimensional structure.
Protein Databases:
• Protein Sequence Databases
• Protein Structural Databases
Protein Sequence Databases:
• PIR
• Swiss-Prot
• TrEMBL
• iProClass
• Pfam
PIR (Protein Information Resource):
• Established in 1984 by the National Biomedical Research Foundation (NBRF).
• Roles:
- Source of annotated protein databases and analysis tools for researchers.
- Provides an introduction to a range of biological databases.
- Highlights the distinction between different data types and indicates where the most important resources are maintained.
- Supports genomic and proteomic research and scientific discovery.
PIR is split into four sections:
• PIR1: contains fully classified and annotated entries.
• PIR2: includes preliminary entries, which have not been thoroughly reviewed and may contain redundancy.
• PIR3: contains unverified entries, which have not been reviewed.
• PIR4: entries fall into one of four categories:
- Conceptual translations of artefactual sequences.
- Conceptual translations of sequences that are not transcribed or translated.
- Protein sequences or conceptual translations that are extensively genetically engineered.
- Sequences that are not genetically encoded and not produced on ribosomes.
Swiss-Prot:
• Founded in 1986 by Amos Bairoch; developed by the Swiss Institute of Bioinformatics and subsequently by Rolf Apweiler at EBI.
• Provides high-level annotation, including descriptions of the function of the protein, the structure of its domains, its post-translational modifications, variants etc.
• Minimal redundancy and integration with other databases.
TrEMBL (Translated EMBL)
• Founded in 1996 as a computer-annotated supplement to Swiss-Prot.
• Contains translations of all coding sequences present in the EMBL, GenBank and DDBJ nucleotide sequence databases, as well as protein sequences extracted from the literature or submitted to Swiss-Prot.
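A conceptual translation of a coding sequence can be sketched in a few lines. This is a minimal illustration using the standard genetic code only; the function name `translate` and the toy input are assumptions, and real pipelines also handle alternative genetic codes, partial codons and annotated reading frames.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence from its first base up to the first stop codon."""
    cds = cds.upper().replace("U", "T")  # accept RNA as well as DNA
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X marks an unrecognised codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCAAATGA"))  # prints MAK
```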
iProClass (Integrated Protein Knowledgebase)
• First released in 2000.
• Provides comprehensive descriptions of protein family, function and structure for UniProt protein sequences.
• Contains value-added descriptions of proteins, including family relationships at global and local levels.
• Serves as a framework for data integration in a distributed networking environment.
• Can also be used to support protein sequence annotation and genomic/proteomic research, and to obtain comprehensive, up-to-date information on proteins.
Uses:
• iProClass provides two types of protein sequence report. The first covers information on genetics, gene, family, structure, function, taxonomy and literature, with cross-references to molecular databases. The second presents PIR superfamily membership information with length, taxonomy and keyword statistics.
• It also provides links to various molecular biology databases.
Pfam
• Founded in 1995 by Erik Sonnhammer, Sean Eddy and Richard Durbin as a collection of commonly occurring protein domains that could be used to annotate the protein-coding genes of multicellular animals.
• A database of protein families.
• Includes annotations and multiple sequence alignments of protein families generated using hidden Markov models.
• The general purpose of the Pfam database is to provide a complete and accurate classification of protein families.
• The classification has been widely adopted by biologists because of its wide coverage of proteins and sensible naming conventions.
Uses:
• Used by experimental biologists researching specific proteins, by structural biologists to identify new targets for structure determination, by computational biologists to organize sequences and by evolutionary biologists to trace the origins of proteins.
• Also allows users to submit protein or DNA sequences to search for matches to families in the database.
Protein Structural Databases:
• PDB
• CATH
• SCOP
• Gene3D
• DBAli
• E-MSD
PDB (Protein Data Bank):
• Established in 1971 at Brookhaven National Laboratory, New York.
• A database of three-dimensional structural data for large biological molecules, such as proteins and nucleic acids.
• Roles:
- A key resource in areas of structural biology, such as structural genomics.
- Provides protein structures to many other databases, e.g. SCOP and CATH.
• Tools:
- ADIT (AutoDep Input Tool), pdb_extract, OOSTAR, OpenRasMol, CIFTr, MAXIT, Biopython, mmLib, XML2PDB.
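The legacy PDB flat-file format stores each atom as a fixed-width ATOM record, so coordinates can be read by column position. The sketch below is a minimal illustration: the `Atom` type, the function name and the sample records are assumptions with made-up coordinates, and production code would normally rely on a library such as Biopython instead.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    residue: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Slice one fixed-width ATOM record (the format counts columns from 1)."""
    return Atom(
        serial=int(line[6:11]),       # columns 7-11: atom serial number
        name=line[12:16].strip(),     # columns 13-16: atom name
        residue=line[17:20].strip(),  # columns 18-20: residue name
        chain=line[21],               # column 22: chain identifier
        res_seq=int(line[22:26]),     # columns 23-26: residue sequence number
        x=float(line[30:38]),         # columns 31-38: x coordinate
        y=float(line[38:46]),         # columns 39-46: y coordinate
        z=float(line[46:54]),         # columns 47-54: z coordinate
    )


# Illustrative records only; not taken from any particular PDB entry.
SAMPLE = (
    "HEADER    ILLUSTRATIVE RECORDS ONLY",
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38",
    "TER",
)

atoms = [parse_atom_line(line) for line in SAMPLE if line.startswith("ATOM")]
for atom in atoms:
    print(atom.name, atom.residue, atom.res_seq, (atom.x, atom.y, atom.z))
```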
CATH (Class, Architecture, Topology and Homologous superfamily)
• Created in the mid-1990s by Professor Christine Orengo and colleagues, including Janet Thornton and David Jones, at University College London.
• A protein structure classification database that shares many broad features with the SCOP resource.
• Provides information on the evolutionary relationships of protein domains.
• Levels:
- Class: domains are assigned according to their secondary structure content.
- Architecture: domains are assigned using the arrangement of secondary structures in three-dimensional space; it describes the gross secondary structure content and packing.
- Topology: encompasses both the overall shape and the connectivity of the secondary structures.
- Homologous superfamily: groups domains thought to share a common ancestor; within a superfamily, domains are further clustered into sequence families at 35% sequence identity.
The four levels of the CATH hierarchy:
#   Level                     Description
1.  Class                     The overall secondary structure content of the domain.
2.  Architecture              High structural similarity but no evidence of homology.
3.  Topology                  A large-scale grouping of topologies which share particular structural features.
4.  Homologous superfamily    Indicative of a demonstrable evolutionary relationship.
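CATH identifiers are written as four dot-separated numbers, one per level, so the hierarchy maps naturally onto a small data structure. The following is a sketch only: the type and function names are assumptions, and the two codes are used purely as illustrative values.

```python
from typing import NamedTuple

# Names of the four main CATH classes.
CLASS_NAMES = {
    1: "Mainly Alpha",
    2: "Mainly Beta",
    3: "Alpha Beta",
    4: "Few Secondary Structures",
}


class CathCode(NamedTuple):
    class_: int
    architecture: int
    topology: int
    homologous_superfamily: int

    def level(self, depth: int) -> str:
        """Dotted prefix naming the node at the given depth (1 = Class ... 4 = H)."""
        return ".".join(str(n) for n in self[:depth])


def parse_cath(code: str) -> CathCode:
    parts = code.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated levels, got {code!r}")
    return CathCode(*(int(p) for p in parts))


a = parse_cath("3.40.50.300")
b = parse_cath("3.40.50.720")
print(CLASS_NAMES[a.class_])     # class of the first domain
print(a.level(3) == b.level(3))  # same topology (fold)?
print(a.level(4) == b.level(4))  # same homologous superfamily?
```

Comparing prefixes at a chosen depth shows how two domains can share a class, architecture and topology while belonging to different homologous superfamilies.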
SCOP (Structural Classification of Proteins)
• Created in 1994.
• Developed at the Centre for Protein Engineering and the Laboratory of Molecular Biology.
• Roles:
- Describes the structural and evolutionary relationships between proteins of known structure.
- Provides a broad survey of all known protein folds, detailed information about the close relatives of any particular protein, and a framework for future research and classification.
E-MSD (European Bioinformatics Institute Macromolecular Structure Database)
• Established in 1996.
• Provides clean macromolecular structure data.
• Accepts and processes depositions to the PDB.
• Transforms the PDB flat-file archive into a relational database system.
• Manages and distributes data on molecular structures in close collaboration with the PDB.
• Tools: AutoDep and EMDep.
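The value of moving from a flat-file archive to a relational system can be illustrated with an in-memory SQLite database. The two-table schema, the entry identifier `DEMO` and the atom rows below are all hypothetical; the actual E-MSD schema is far richer and is not reproduced here.

```python
import sqlite3

# Hypothetical two-table schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entry (entry_id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE atom (
        entry_id TEXT REFERENCES entry(entry_id),
        serial   INTEGER,
        name     TEXT,
        residue  TEXT,
        chain    TEXT,
        res_seq  INTEGER,
        x REAL, y REAL, z REAL
    );
""")

# Made-up rows standing in for fields parsed out of a flat file.
conn.execute("INSERT INTO entry VALUES (?, ?)", ("DEMO", "illustrative entry"))
rows = [
    ("DEMO", 1, "N", "MET", "A", 1, 27.340, 24.430, 2.614),
    ("DEMO", 2, "CA", "MET", "A", 1, 26.266, 25.413, 2.842),
    ("DEMO", 3, "N", "GLN", "A", 2, 26.930, 27.748, 3.370),
]
conn.executemany("INSERT INTO atom VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Once the data are relational, questions become declarative queries
# rather than scans over text files.
residue_counts = conn.execute(
    "SELECT residue, res_seq, COUNT(*) FROM atom "
    "WHERE entry_id = ? GROUP BY chain, res_seq, residue ORDER BY res_seq",
    ("DEMO",),
).fetchall()
print(residue_counts)
```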
Gene3D:
• Provides structural annotation for protein sequences based on the CATH classification.
• Uses the information in CATH to predict the locations of structural domains on millions of protein sequences available in public databases.
• Provides comprehensive structural and functional annotation of most available protein sequences, including those in the UniProt, RefSeq and Integr8 resources.
Thank you

Sequence and Structural Databases of DNA and Protein, and its significance in Scientific Researches

  • 1. S.BITUILA II MSC. Sequence and structural databases of Dna and protein , and its significances in scientific researches.
  • 2. DNA Databases: ī‚§ Sequence Databases ī‚§ Structural Databases
  • 3. DNA Sequence Databases: ī‚§ NCBI ī‚§ EMBL ī‚§ DDBJ ī‚§ Ensembl ī‚§ GenBank ī‚§ EBI ī‚§ UniGene
  • 4. NCBI (National Centre for Biotechnological Information) ī‚§ Established in the year 1988 ī‚§ It aims to create public databases , develop software tools for sequence analysis and disseminate biomedical information, mainly to aid the research in computational biology. ī‚§ Roles: -Maintains several biological databases eg.GenBank,the nucleic acid sequence database. -provides data retrieval system (eg.Entrez) -provides computational resources for the analysis of GenBank data and a variety of other biological databases.
  • 5. Tools available in NCBI: ī‚§ BLAST,Entrez,standard BLAST,megaBLAST, mega BLAST,PSI- BLAST,RPS-BLAST ī‚§ Types of Databases : -Nucleotide database -Literature database -protein database -Gene expression -Structural database -Chemical and others.
  • 6.
  • 7. EMBL(European Molecular Biology Laboratory) ī‚§ Established in the year 1974 by Leo Sjilard , James Watson and John Kendrew. ī‚§ Roles: -Incorporates , Organizes and Distributes nucleotide sequences from the public sources. -Performs basic researches in molecular biology and medicine as well as trains Scientists, students and visitors. ī‚§ Tools: -Ppsearch,GeneQuiz,FASTA,DALI,BLAST- 2,Radar,Dali-Lite etc.
  • 8.
  • 9. DDBJ(DNA Databank of Japan) ī‚§ Established in the year 1986 ī‚§ Roles: -Collects nucleotide sequence data and provides freely available nucleotide sequence data. -Provides supercomputer system to support research activities in Life Sciences. Tools: -Getentry,SRS,TXSearch,LIBRA,GIB.
  • 10.
  • 11. Ensemble: ī‚§ Launched in the year 1999 in response to the imminent completion of the Human Genome Project. ī‚§ Joint Project between the European Bioinformatics Institute and the welcome Trust Sanger Institute. ī‚§ It aims to provide a centralized resource for geneticists, molecular biologists and other researchers studying the genomes of our own species and other vertebrates and model organisms. ī‚§ Genome databases for vertebrates and other eukaryotic species . ī‚§ It is one of the well known genome browsers for the retrieval of genomic information. ī‚§ Plays a major role in ENCODE (Encyclopaedia of DNA Elements Consortium) Project. ī‚§ Tools: BLAST ,Data Slicer, Variant Effect Predictor, Assembly converter etc.
  • 12. GenBank: ī‚§ Started in the year 1982 by Walter Goad and Los Alamos National Laboratory. ī‚§ Produced and maintained by the National Centre for Biotechnology Information (NCBI) as a part of the International Nucleotide Sequence Database Collaboration(INSDC) ī‚§ Roles: -open access ,annotated collection of all publicly available nucleotide sequences and their protein translations. -Provide and encourage access within the scientific community. ī‚§ Tools: Bar S Tool, Sequin, BLAST,
  • 13. EBI (European Bioinformatics Institute):
      - Its roots go back to 1980, when the EMBL Nucleotide Sequence Data Library was established; EMBL-EBI itself was founded in 1994.
      - EMBL-EBI is a centre for research and services in bioinformatics and is part of the European Molecular Biology Laboratory (EMBL).
      - It hosts a number of open, free-to-use life sciences resources, including biomedical databases, analysis tools and bio-ontologies, such as:
          - ArrayExpress: archive of gene expression experiments.
          - BioModels: a database of computational models relevant to the life sciences.
          - BioStudies: a generic data archive at EMBL-EBI for biomolecular datasets.
          - European Nucleotide Archive (ENA): resource for nucleotide sequencing information.
  • 14.
  • 15. UniGene:
      - An NCBI database of the transcriptome and thus, despite the name, not primarily a database of genes.
      - Provides information on protein similarities, gene expression, cDNA clones and genomic location.
  • 17. RNase P Database:
      - Compilation of RNase P sequences, sequence alignments, secondary structures, three-dimensional models and accessory information.
      - Also contains secondary structures of bacterial and archaeal RNAs, including specially annotated 'reference' secondary structures of E. coli and Bacillus subtilis RNase P RNAs, a minimum phylogenetic consensus structure, and coordinates for models of three-dimensional structure.
  • 18.
  • 19. Protein Databases:
      - Protein sequence databases
      - Protein structural databases
  • 20. Protein Sequence Databases:
      - PIR
      - Swiss-Prot
      - TrEMBL
      - iProClass
      - Pfam
  • 21. PIR (Protein Information Resource):
      - Established in 1984 by the National Biomedical Research Foundation (NBRF).
      - Roles:
          - Source of annotated protein databases and analysis tools for researchers.
          - Provides an introduction to a range of biological databases.
          - Highlights the distinction between different data types and indicates where the most important resources are maintained.
          - Supports genomic and proteomic research and scientific discovery.
  • 22. PIR is split into four sections:
      - PIR1: contains fully classified and annotated entries.
      - PIR2: includes preliminary entries, which have not been thoroughly reviewed and may contain redundancy.
      - PIR3: contains unverified entries, which have not been reviewed.
      - PIR4: entries fall into one of four categories:
          - Conceptual translations of artefactual sequences.
          - Conceptual translations of sequences that are not transcribed or translated.
          - Protein sequences or conceptual translations that are extensively genetically engineered.
          - Sequences that are not genetically encoded and not produced on ribosomes.
  • 23.
  • 24. Swiss-Prot:
      - Founded in 1986 by Amos Bairoch, developed at the Swiss Institute of Bioinformatics and subsequently with Rolf Apweiler at the EBI.
      - Provides high-level annotation, including descriptions of a protein's function, the structure of its domains, its post-translational modifications, variants, etc.
      - Minimal redundancy and integration with other databases.
  • 25. TrEMBL (Translated EMBL)
      - Founded in 1996 as a computer-annotated supplement to Swiss-Prot.
      - Contains translations of all coding sequences present in the EMBL, GenBank and DDBJ nucleotide sequence databases, as well as protein sequences extracted from the literature or submitted to Swiss-Prot.
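The translation step behind a database like TrEMBL can be illustrated with a short sketch. This is not TrEMBL's actual pipeline; it assumes the standard genetic code, a coding sequence that is already in frame, and a hypothetical helper name `translate_cds`.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# for the first, second and third codon positions.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(dna):
    """Translate an in-frame coding sequence, stopping at the first stop codon.

    Codons containing anything other than A, C, G or T become 'X'.
    Trailing bases that do not fill a complete codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate_cds("ATGGCCTGGTAA"))  # MAW
```

Real annotation pipelines also have to handle alternative genetic codes, partial coding sequences and frame information from the nucleotide record, none of which this sketch attempts.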
  • 26.
  • 27. iProClass (Integrated Protein Knowledgebase)
      - First released in 2000.
      - Provides a comprehensive description of protein family, function and structure for UniProt protein sequences.
      - Contains value-added descriptions of proteins, including family relationships at global and local levels.
      - Serves as a framework for data integration in a distributed networking environment.
      - Can also be used to support protein sequence annotation and genomic/proteomic research, giving comprehensive, up-to-date information on proteins.
  • 28. Uses:
      - iProClass provides two types of protein sequence report. The first covers information on family, structure, function, genes, taxonomy and literature, with cross-references to molecular databases. The second presents PIR superfamily membership information with length, taxonomy and keyword statistics.
      - It also provides links to various molecular biology databases.
  • 29.
  • 30. Pfam
      - Founded in 1995 by Erik Sonnhammer, Sean Eddy and Richard Durbin as a collection of commonly occurring protein domains that could be used to annotate the protein-coding genes of multicellular animals.
      - It is a database of protein families.
      - Includes annotations and multiple sequence alignments of protein families generated using hidden Markov models.
      - Its general purpose is to provide a complete and accurate classification of protein families.
      - Pfam has been widely adopted by biologists because of its wide coverage of proteins and sensible naming conventions.
  • 31.
  • 32. Uses:
      - Used by experimental biologists researching specific proteins, by structural biologists to identify new targets for structure determination, by computational biologists to organize sequences, and by evolutionary biologists to trace the origins of proteins.
      - Also allows users to submit protein or DNA sequences to search for matches to families in the database.
  • 33. Structural Databases of Protein:
      - PDB
      - CATH
      - SCOP
      - Gene3D
      - DBAli
      - E-MSD
  • 34. PDB (Protein Data Bank):
      - Established in 1971 at Brookhaven National Laboratory, New York.
      - A database of three-dimensional structural data for large biological molecules such as proteins and nucleic acids.
      - Roles:
          - A key resource in areas of structural biology such as structural genomics.
          - Provides protein structures to many other databases, e.g. SCOP and CATH.
      - Tools: ADIT (AutoDep Input Tool), pdb_extract, OOSTAR, OpenRasMol, CIFTr, MAXIT, Biopython, mmLib, XML2PDB.
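The three-dimensional data in a legacy PDB file is stored as fixed-column text records. The sketch below reads a few fields from one ATOM/HETATM line using the column positions of the PDB flat-file format. The helper name `parse_atom_line` and the sample values are illustrative assumptions; production code would normally rely on a library such as Biopython.

```python
def parse_atom_line(line):
    """Extract selected fields from one ATOM/HETATM record (hypothetical helper).

    Slices use 0-based Python indices; the PDB format numbers columns from 1.
    """
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return {
        "serial": int(line[6:11]),         # columns 7-11: atom serial number
        "name": line[12:16].strip(),       # columns 13-16: atom name
        "res_name": line[17:20].strip(),   # columns 18-20: residue name
        "chain": line[21],                 # column 22: chain identifier
        "res_seq": int(line[22:26]),       # columns 23-26: residue number
        "x": float(line[30:38]),           # columns 31-38: x coordinate (angstroms)
        "y": float(line[38:46]),           # columns 39-46: y coordinate
        "z": float(line[46:54]),           # columns 47-54: z coordinate
    }


# Build an illustrative record field by field so the column widths are explicit.
sample = (
    "ATOM  " + "1".rjust(5) + " " + " N  " + " " + "MET" + " " + "A"
    + "1".rjust(4) + " " * 4
    + f"{27.340:8.3f}{24.430:8.3f}{2.614:8.3f}"
)

atom = parse_atom_line(sample)
print(atom["name"], atom["res_name"], atom["chain"], atom["x"], atom["y"], atom["z"])
```

Slicing by position rather than splitting on whitespace matters here, because adjacent fields in a PDB record can run together without a separating space.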
  • 35.
  • 36. CATH (Class, Architecture, Topology and Homology)
      - Created in the mid-1990s by Professor Christine Orengo and colleagues, including Janet Thornton and David Jones, at University College London.
      - A protein structure classification database that shares many broad features with the SCOP resource.
      - Provides information on the evolutionary relationships of protein domains.
      - Levels:
          - Class: domains are assigned according to their secondary structure content.
          - Architecture: assignment uses information on the arrangement of secondary structures in three-dimensional space; it describes the gross secondary structure content and packing.
          - Topology: encompasses both the overall shape and the connectivity of secondary structures.
          - Homology: groups domains that share more than 35% sequence identity and are thought to share a common ancestor.
  • 37. The four levels of the CATH hierarchy:
      # | Level                   | Description
      1 | Class                   | The overall secondary structure content of the domain.
      2 | Architecture            | A large-scale grouping of topologies which share particular structural features.
      3 | Topology                | High structural similarity but no evidence of homology.
      4 | Homologous superfamily  | Indicative of a demonstrable evolutionary relationship.
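CATH identifiers encode the four levels as dot-separated numbers, so the hierarchy can be read directly from an ID. The sketch below splits such an identifier into its levels. The names `CathNode` and `parse_cath_id` are hypothetical, the class labels are the commonly used names for CATH classes 1 to 4, and `3.40.50.300` serves only as an example input.

```python
from typing import NamedTuple

# Commonly used names of the main CATH classes (level 1 of the hierarchy).
CATH_CLASSES = {
    1: "Mainly Alpha",
    2: "Mainly Beta",
    3: "Alpha Beta",
    4: "Few Secondary Structures",
}


class CathNode(NamedTuple):
    """One superfamily-level position in the hierarchy (hypothetical type)."""
    class_: int
    architecture: int
    topology: int
    homologous_superfamily: int


def parse_cath_id(cath_id):
    """Split a dotted C.A.T.H identifier into its four numeric levels."""
    parts = cath_id.split(".")
    if len(parts) != 4:
        raise ValueError("expected four dot-separated levels: " + cath_id)
    return CathNode(*(int(part) for part in parts))


node = parse_cath_id("3.40.50.300")
print(node)
print(CATH_CLASSES.get(node.class_, "Other"))  # Alpha Beta
```

Two domains share a level when their identifiers agree up to that position, so comparing ID prefixes is enough to tell whether they fall in the same class, architecture, topology or homologous superfamily.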
  • 38.
  • 39. SCOP (Structural Classification of Proteins)
      - Created in 1994.
      - Developed at the Centre for Protein Engineering and the Laboratory of Molecular Biology.
      - Roles:
          - Describes structural and evolutionary relationships between proteins of known structure.
          - Provides a broad survey of all known protein folds, detailed information about the close relatives of any particular protein, and a framework for future research and classification.
  • 40.
  • 41. E-MSD
      - Established in 1996.
      - Provides clean macromolecular structure data.
      - Accepts and processes depositions to the PDB.
      - Transforms the PDB flat-file archive into a relational database system.
      - Manages and distributes data on molecular structures in close collaboration with the PDB.
      - Tools: AutoDep and EMDep.
  • 42. Gene3D:
      - Provides structural annotation for proteins in the CATH sequence database.
      - Uses the information in CATH to predict the locations of structural domains on millions of protein sequences available in public databases.
      - Provides comprehensive structural and functional annotation of most available protein sequences, including those in the UniProt, RefSeq and Integr8 resources.
  • 43. References:
      - Bioinformatics by Sabu M Thampi
      - Bioinformatics by Dardel
      - Bioinformatics for Biologists by Dr. Murtada Alshareifi
      - https://bioinf.comav.upv.es