Presented by – SWARUP MALAKAR
A database is a repository of sequences (DNA or amino acid) stored in a
computer, which provides a centralized and homogeneous view of its contents.
Alternatively, it is a large collection of data pertaining to a specific topic,
e.g., nucleotide sequences or protein sequences.
Basically, it is an electronic environment.
Databases are at the heart of bioinformatics.
1. Sequence databases: these hold the sequences of both proteins and nucleic
acids.
2. Structural databases: these hold three-dimensional structures, mainly of proteins.
In addition, databases are classified into three categories:
A. Primary databases B. Secondary databases C. Composite databases.
A primary database contains information on the sequence or structure alone, of
either a protein or a nucleic acid.
Example: PIR and SWISS-PROT for protein sequences; GenBank (NCBI), EMBL and DDBJ
for nucleotide sequences.
PIR (Protein Information Resource): It provides functionally annotated
protein sequences and structure.
PIR has collaborated with EBI and SIB to establish UniProt
(Universal Protein Resource), the central resource of
protein sequence and function.
TrEMBL
NCBI (National Center for Biotechnology Information):
- On Nov 4, 1988, the NCBI was established as a division of the National Library of Medicine for the
development of information systems in molecular biology.
- The NCBI is located in Bethesda, Maryland (U.S.A.).
- NCBI maintains GenBank, an annotated collection of publicly available nucleotide sequences and
their protein translations.
- In 1988, the three partners (DDBJ, EMBL and GenBank) of the International Nucleotide
Sequence Database Collaboration met and agreed to use a common format.
i. Maintains collaboration with several NIH institutes, academia, industry and other governmental
agencies.
ii. Develops, distributes, supports and coordinates access to a variety of databases and software for
the scientific and medical communities.
iii. Develops and promotes standards for databases, data deposition and exchange, and biological
nomenclature.
iv. Engages the members of the international scientific community in informatics research and training
through the scientific visitors programs.
Link: https://www.ncbi.nlm.nih.gov/
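
As an illustration of programmatic access to NCBI databases, the sketch below builds a request URL for the NCBI E-utilities EFetch service. It is a minimal sketch, not part of the original slides: the function name build_efetch_url is hypothetical, the accession "U00096" is only an example value, and the code constructs the URL without sending any request.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities EFetch service.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession: str, db: str = "nuccore", rettype: str = "fasta") -> str:
    """Return an EFetch URL that requests one record as plain text.

    build_efetch_url is a hypothetical helper name used for illustration.
    """
    params = {"db": db, "id": accession, "rettype": rettype, "retmode": "text"}
    return f"{EFETCH_URL}?{urlencode(params)}"


# Example accession used only for illustration.
example_url = build_efetch_url("U00096")
```

Opening the resulting URL in a browser, or with any HTTP client, should return the record in FASTA format, subject to NCBI's usage policies.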
- In 1992, NCBI took over responsibility for the GenBank DNA sequence
database.
- It coordinates with individual laboratories and other sequence
databases such as those of EMBL and DDBJ.
- Moreover, NCBI has grown to provide other databases in
addition to GenBank.
- GenBank is a comprehensive sequence database that contains
publicly available DNA sequences for more than 119,000
different organisms, obtained through submissions of
sequence data from individual laboratories and batch submissions from
large-scale sequencing projects.
- Daily data exchange with the EMBL Data Library in the UK and
the DNA Data Bank of Japan helps ensure worldwide coverage.
- Developed and maintained by the European Bioinformatics Institute of the European
Molecular Biology Laboratory (EMBL-EBI).
- Comprehensive nucleotide sequence information.
- The European Molecular Biology Laboratory (EMBL) Nucleotide Sequence Database is a
comprehensive collection of primary nucleotide sequences maintained at the European
Bioinformatics Institute (EBI).
- Link: http://www.ebi.ac.uk/embl/
- EMBL is supported by 22 member states, four prospect states, and two associate states.
- The laboratory operates from five sites: the main laboratory in Heidelberg, and
outstations in Hinxton (EBI, in England), Grenoble (France), Hamburg (Germany) and
Monterotondo (near Rome).
- EMBL groups and laboratories perform basic research in molecular biology and molecular
medicine, as well as training for science students and visitors.
- Since 1982 this work has been done in collaboration with GenBank (NCBI, Bethesda, USA)
and the DNA Data Bank of Japan (Mishima).
- For sequence similarity searching, a variety of tools (e.g., FASTA and BLAST)
are available that allow external users to compare their own sequences against the data in the
EMBL Nucleotide Sequence Database and other databases.
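
To make the idea of comparing a query sequence against a database more concrete, here is a toy sketch. It parses FASTA-formatted text (the common ">header" plus sequence lines file format) and ranks records by how many short words (k-mers) they share with the query. This is a simplified illustration only, not the actual FASTA or BLAST algorithm; the function names and the two toy records are hypothetical.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA-formatted text into {record_id: sequence}."""
    records: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first word after '>'.
            header = line[1:].split()
            current = header[0] if header else ""
            records[current] = []
        elif current is not None:
            records[current].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_by_shared_kmers(query: str, database: dict[str, str], k: int = 4) -> list[tuple[str, int]]:
    """Rank database records by the number of distinct k-mers shared with the query."""
    query_words = kmers(query.upper(), k)
    scores = [(name, len(query_words & kmers(seq, k))) for name, seq in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy database; these are not real database records.
TOY_DB = parse_fasta("""
>seqA toy record
ACGTACGTGA
>seqB toy record
TTTTGGGGCC
""")

ranking = rank_by_shared_kmers("ACGTACGT", TOY_DB)
```

Real search tools add scoring matrices, gapped alignment and statistical significance on top of this kind of word matching, so the ranking here should be read only as a teaching aid.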
- The DNA Data Bank of Japan (DDBJ) is a biological database that collects DNA
sequences. It was established in 1986.
- Link: https://www.ddbj.nig.ac.jp
- It is located at the National Institute of Genetics (NIG) in Shizuoka Prefecture,
Japan.
- DDBJ is a member of the International Nucleotide Sequence Database
Collaboration (INSDC).
- It exchanges its data daily with the European Molecular Biology Laboratory at the European
Bioinformatics Institute and with GenBank at the National Center for Biotechnology
Information.
- As a member of the INSDC, the DDBJ Center collects nucleotide sequence data and
provides freely available nucleotide sequence data and a supercomputer system to support
research activities in life science.
FEATURES
- Group 1: biological source of the sequence (source). The feature "source" (group 1) is
mandatory for all entries in the international nucleotide database. ...
- Group 2: biological function features of the region. ...
- Group 3: difference and/or change of the sequence data.
List of notable data sets released from the DNA Data Bank of Japan (DDBJ) sequence databases from June 2015 to May 2016

| Data type | Organism | Accession numbers for annotated sequences (number of entries) | Accession numbers for raw reads |
|---|---|---|---|
| Genome | Radish (Raphanus sativus cv. Aokubi S-h) | WGS: BAOO01000001-BAOO01072909 (72,909 entries); scaffold CON: DF196826-DF236948 (40,123 entries) | DRR012610-DRR012624 |
| | Soybean (Glycine max cv. Enrei) | BBNX02000001-BBNX02108601 (108,601 entries) | DRR021740-DRR021744 |
| | Common marmoset (Callithrix jacchus) | WGS: BBXK01000001-BBXK01109198 (109,198 entries); scaffold CON: DG000097-DG000120 (24 entries); GSS: LB274659-LB427105 (152,447 entries) | DRR036754-DRR036764 |
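
The entry counts in this table follow directly from the accession ranges: each range is inclusive, so the count is last number minus first number plus one. The sketch below checks that arithmetic. It assumes every accession is a run of capital letters followed by digits, and that both ends of a range share the same letter prefix; the function names are hypothetical.

```python
import re


def split_accession(accession: str) -> tuple[str, int]:
    """Split an accession into its letter prefix and numeric suffix."""
    match = re.fullmatch(r"([A-Z]+)(\d+)", accession)
    if match is None:
        raise ValueError(f"unexpected accession format: {accession!r}")
    return match.group(1), int(match.group(2))


def count_entries(accession_range: str) -> int:
    """Count the entries in an inclusive range such as 'DF196826-DF236948'."""
    first, last = (part.strip() for part in accession_range.split("-"))
    first_prefix, first_number = split_accession(first)
    last_prefix, last_number = split_accession(last)
    if first_prefix != last_prefix:
        raise ValueError("range endpoints have different prefixes")
    # For WGS accessions the leading version digits (e.g. "01") are part of the
    # numeric suffix; they are identical at both ends, so they cancel out.
    return last_number - first_number + 1


radish_wgs = count_entries("BAOO01000001-BAOO01072909")   # 72909
radish_scaffolds = count_entries("DF196826-DF236948")     # 40123
```

Applying the same function to the soybean and marmoset ranges reproduces the other counts listed in the table.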
- Hosted at the National Institute of Genetics.
- Data come mainly from scientists in Japan, and also from sources all over the world; DDBJ shares this
nucleotide data with EMBL and GenBank.
- DDBJ is officially certified to collect nucleotide sequences from researchers and to issue
internationally recognized accession numbers to data submitters.
- About 99% of the nucleotide data submitted to INSDC by Japanese researchers comes through DDBJ.
- This database plays a major role in improving the quality of INSDC.
- Each database entry includes details of the sequence, submitter details, bibliographic
references, biological significance, and the scientific name and taxonomy of the organism.
- Features that identify coding regions, transcription units, mutation sites etc. are displayed
in a feature table.
Major activities of the database:
- Providing internationally recognized accession numbers to sequences.
- Managing bioinformatics databases and developing tools for the analysis and visualization of
biological data.
- Conducting courses for beginners to reduce the complexity of biological data analysis.
Databases for Protein and Nucleic Acid Sequences