Madhubala.S
Assistant Professor
Department of Biotechnology
Sri Adi Chunchanagiri Women’s
College, Cumbum.
Nucleic acid database
 The Nucleic Acid Database (NDB) is a web portal providing access to
information about 3D nucleic acid structures and their
complexes.
 It was founded in 1991 and distributes structural information
about nucleic acids.
 When the NDB was established, its focus was on
DNA structural biology.
 The NDB has developed generalized software
for processing, archiving, querying and distributing
structural data for nucleic acids.
NCBI (National Center for
Biotechnology Information)
 NCBI is part of the United States National Library of Medicine
(NLM), a branch of the National Institutes of Health (NIH).
 It was founded in 1988 in Bethesda, Maryland.
 NCBI hosts a series of databases relevant to biotechnology and
biomedicine and is an important resource for bioinformatics tools.
 All of these databases can be accessed through Entrez.
 The NCBI Bookshelf is a collection of freely accessible, downloadable, online
versions of selected biomedical books.
 NCBI, EBI and CIB together form the International Nucleotide Sequence
Database Collaboration (INSDC).
 The major collaborative databases are GenBank, EMBL and DDBJ.
 Major NCBI databases include PubMed, a bibliographic
database, and NCBI Epigenomics.
NCBI (continued)
 BLAST is a tool for finding sequences similar to a query sequence. It
searches the query sequence against NCBI databases.
 NCBI BLAST results can be displayed in a graphical format.
 HTML is the default output format of the NCBI BLAST web page.
 Entrez is the cross-database search system used at NCBI for all major
databases, such as DNA and protein sequences, protein structures, PubMed and
OMIM.
 NCBI distributed the first version of Entrez in 1991.
 NCBI has implemented ‘Gene’ to characterise and organise information about genes.
 It serves as a major node in the nexus of genomic map, expression, sequence
and related data.
 NCBI imports 3D structures from the PDB.
 PubChem is a public database of molecules and their biological activities at
NCBI.
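The cross-database idea behind Entrez can be illustrated with a small Python sketch. It assumes NCBI's public E-utilities web interface (the ESearch endpoint with its `db`, `term`, `retmax` and `retmode` parameters), which the slides do not mention. The function name `build_esearch_url` and the search term are hypothetical, chosen only for illustration. The code only builds the query URLs and does not contact NCBI.

```python
from urllib.parse import urlencode

# Assumption: base address of NCBI's E-utilities, the programmatic
# interface to Entrez (not mentioned in the slides themselves).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db, term, retmax=20):
    """Return an ESearch URL that queries a single Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# The same search term can be sent to several Entrez databases in turn.
for db in ("pubmed", "protein", "omim"):
    print(build_esearch_url(db, "BRCA1 AND human[Organism]"))
```

Only the `db` value changes between the three URLs, which is what makes Entrez a single search system over many databases.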
EMBL
 EMBL (European Molecular Biology Laboratory) is a molecular biology research
institute, founded in 1974, that maintains a nucleotide sequence database.
 http://www.ebi.ac.uk/embl/index.html
 The database is maintained by EBI (European Bioinformatics Institute) in an
international collaboration with DDBJ and GenBank.
 Data are exchanged among the collaborating databases on a daily basis.
 Individual authors and genome project groups are the major
sources of information for EMBL.
 Many sequence similarity search tools are available at EMBL.
DDBJ
 DDBJ (DNA Data Bank of Japan)
 It is a biological database that collects DNA sequences.
 It is located at the National Institute of Genetics (NIG) in Japan.
 It began operating in 1986.
 It receives submissions mainly from Japanese researchers.
 It is funded by the Ministry of Education, Culture, Sports, Science and
Technology of Japan (MEXT).
 It collects nucleotide sequence data as a member of the INSDC
(International Nucleotide Sequence Database Collaboration).
 The data collected from various sources in DDBJ are freely
accessible to researchers.
PROTEIN DATA BANKS
 The Protein Data Bank (PDB) is a database of three-dimensional
structural data for large biological molecules, such as proteins and nucleic
acids.
 PDB has three official branches: the Research Collaboratory for Structural
Bioinformatics (RCSB, USA), the European Bioinformatics Institute
(PDBe, UK), and the Protein Data Bank Japan (PDBj, Osaka).
 PDB access is provided through primary web and ftp sites (www.pdb.org,
ftp.pdb.org) or via multiple mirror sites distributed worldwide.
 The Protein Data Bank (PDB) was established in 1971 with fewer than ten
X-ray crystallographic structures of proteins, becoming the first open
access digital data resource in the biological sciences.
 The Protein Data Bank (PDB) file format is a text file format describing
the three-dimensional structures of molecules held in the Protein Data
Bank.
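Because the PDB file format is plain text with fixed-width columns, its atom records can be read by slicing each line. The Python sketch below assumes the standard column layout of `ATOM` records (serial number in columns 7-11, atom name in 13-16, residue name in 18-20, chain in 22, residue number in 23-26, and x, y, z in 31-54). The sample lines are made-up illustrative values, not taken from a real PDB entry, and the names `Atom`, `parse_atom_line` and `parse_atoms` are hypothetical.

```python
from dataclasses import dataclass

# Made-up sample in PDB layout (illustrative values, not a real entry).
SAMPLE = """\
HEADER    HYPOTHETICAL EXAMPLE
ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N
ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C
END
"""


@dataclass
class Atom:
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line):
    """Slice one fixed-width ATOM record (PDB columns are 1-based)."""
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


def parse_atoms(text):
    """Return all ATOM records in the text; other record types are skipped."""
    return [parse_atom_line(l) for l in text.splitlines() if l.startswith("ATOM")]


for atom in parse_atoms(SAMPLE):
    print(atom.serial, atom.name, atom.res_name, atom.chain_id, atom.x, atom.y, atom.z)
```

This sketch reads only `ATOM` lines. A fuller reader would also handle `HETATM` and other record types, and established libraries are a better choice for real work.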
PROTEIN DATA BANKS (continued)
 The Protein Data Bank (PDB) archive is the single worldwide repository
of information about the 3D structures of large biological molecules,
including proteins and nucleic acids.
 The RCSB PDB has an international community of users, including:
 Biologists (in fields such as structural biology, biochemistry, genetics,
pharmacology)
 Other scientists (in fields such as bioinformatics) and software developers for
data analysis and visualization
 Students and educators (all levels); media writers, illustrators, textbook
authors; and the general public.
 The RCSB PDB Advisory Committee is made up of an international team
of experts in X-ray crystallography, cryoEM, NMR, bioinformatics and
education. RCSB PDB appreciates the valuable feedback they provide on
an ongoing basis.
Swiss-Prot
 The UniProt Knowledgebase (UniProtKB), part of the Universal Protein Resource
(UniProt), is the central hub for the collection of functional information on proteins.
 Swiss-Prot is the manually annotated and reviewed section of UniProtKB.
 It was established in 1986.
 It is maintained by the Swiss Institute of Bioinformatics (SIB) and EBI.
 It is a protein sequence database that provides a high level
of annotation, minimal redundancy and integration with other
databases.
 Data can be divided into two types: core data and annotation data.
 Core data contain the sequence, citation information (bibliographic references)
and taxonomic information (the biological source of the protein).
 Annotation data describe the function of the protein, post-translational
modifications, domains and sites, secondary structure, quaternary
structure and associated diseases.
Minimal redundancy
 Much of the data for a protein comes from more than one literature report.
 The data are condensed and merged so that each entry is concise and coherent.
 Conflicts between reports are listed in each entry.
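The merge-and-list-conflicts idea can be sketched as a toy Python example. This is not how Swiss-Prot curators actually work: the majority vote, the equal-length restriction, the function name `merge_reports` and the report names are all assumptions made for illustration.

```python
from collections import Counter


def merge_reports(reports):
    """Merge equal-length sequence reports into one sequence plus a conflict list.

    `reports` maps a report name to its sequence. Assumption: the most
    common residue at each position is kept (a simple majority vote).
    """
    if len({len(seq) for seq in reports.values()}) != 1:
        raise ValueError("this sketch only handles reports of equal length")
    merged = []
    conflicts = []
    for pos, residues in enumerate(zip(*reports.values()), start=1):
        counts = Counter(residues)
        merged.append(counts.most_common(1)[0][0])
        if len(counts) > 1:
            # Record which report gave which residue at this position.
            conflicts.append((pos, dict(zip(reports, residues))))
    return "".join(merged), conflicts


# Hypothetical reports of the same protein from three papers.
reports = {
    "report_A": "MKTAYIAK",
    "report_B": "MKTAYIAK",
    "report_C": "MKSAYIAK",
}
sequence, conflicts = merge_reports(reports)
print(sequence)
print(conflicts)
```

The three reports collapse into one entry, and the single disagreement at position 3 is kept alongside it rather than discarded.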
Integration with other data
 Swiss-Prot provides cross-references to external data collections.
 This integrates the three types of sequence-related databases (nucleic
acid sequence, protein sequence and protein tertiary structure).
Thank
you

Nucleic acid and protein databanks
