GenBank (Genetic Sequence Databank)
Introduction:
 GenBank® is the genetic sequence database at the National Center for
Biotechnology Information (NCBI).
 It was established in 1982 and is now maintained by the National Center for
Biotechnology Information (NCBI).
 DNA sequences can be submitted to GenBank using several different methods.
 It contains publicly available nucleotide sequences for more than 240 000 named
organisms, obtained primarily through submissions from individual laboratories
and batch submissions from large-scale sequencing projects.
 Each record is stored as a flat file, a plain ASCII text file that both humans and
computers can read and download.
 There are two main ways of making batch sequence submissions to GenBank:
NCBI’s Barcode Submission Tool (BarSTool) and Sequin.
 Entry data contains information on:
1. The sequence;
2. Accession numbers;
3. The scientific and gene names;
4. Taxonomy/phylogenetic classification of the source organism;
5. A feature that identifies coding regions;
6. References to published literature;
7. Transcription units
8. Mutation sites.
GenBank flat file Format
1. The LOCUS field: It consists of five different subfields, namely:
 1a Locus Name (e.g. HSHFE) - It is a tag for grouping similar sequences.
 The first two or three letters usually designate the organism.
 In this case HS stands for Homo sapiens. The last several characters are associated with another
group designation, such as gene product. In this example, the last three characters represent the gene
symbol, HFE.
 1b Sequence Length (12146 bp) – It is the total number of nucleotide base pairs (or amino acid
residues) in the sequence record.
 1c Molecule Type (e.g. DNA) - Type of molecule that was sequenced.
 1d GenBank Division (PRI) - GenBank has different divisions.
 In this example, PRI stands for primate sequences.
 Other divisions include ROD (rodent sequences), MAM (other mammal sequences), PLN (plant,
fungal, and algal sequences), & BCT (bacterial sequences).
 1e Modification Date (23-July-1999) - Date of most recent modification made to the record.
2. DEFINITION – It is a brief description of the sequence.
 The description may include source organism name, gene or protein name, or designation as
untranscribed or untranslated sequences (e.g., a promoter region).
 For sequences containing a coding region (CDS), the definition field may also contain a
“completeness” qualifier such as "complete CDS" or "exon 1."
3. ACCESSION (Z92910): – It is a unique identifier assigned to a complete sequence record.
 This number never changes, even if the record is modified.
4. VERSION (Z92910.1) – It is an identification number assigned to a single, specific sequence in
the database.
 This number is in the format “accession.version.”
 If any changes are made to the sequence data, the version part of the number will increase by one.
 E.g. U12345.1 becomes U12345.2.
5. GenInfo Identifier (GI) (1890179) - Also a sequence identification number.
 Whenever a sequence is changed, the version number is increased and a new GI is assigned.
6. KEYWORDS (haemochromatosis; HFE gene) – A “keyword” can be “any word or phrase used
to describe the sequence”.
7. SOURCE (human) - Usually contains an abbreviated or common name of the source organism.
8. ORGANISM (Homo sapiens) - The scientific name (usually genus & species)
9. REFERENCE – It is a citation of publications by sequence authors that supports information
presented in the sequence record.
 Several references may be included in one record.
 References are automatically sorted from the oldest to the newest.
 Cited publications are searchable by author, article or publication title, journal title, or MEDLINE
unique identifier (UID).
10. The FEATURES Table:
11. BASE COUNT - Gives the total number of adenine (A), cytosine (C), guanine (G), and thymine
(T) bases in the sequence.
12. ORIGIN - Contains the sequence data, which begins on the line immediately below the field
title.
// (two slashes mark the end of the record)
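The relationship between ORIGIN, BASE COUNT and the closing // can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequence below is a short placeholder, not the real HSHFE sequence, and the function names read_origin and base_count are hypothetical helpers, not part of any NCBI tool.

```python
from collections import Counter

# Placeholder record fragment: the bases are invented for illustration only.
RECORD = """\
ORIGIN
        1 acgtacgtac acgtacgtac
       21 ggtt
//
"""

def read_origin(record_text):
    """Return the sequence found between the ORIGIN line and the // terminator."""
    bases = []
    in_origin = False
    for line in record_text.splitlines():
        if line.startswith("//"):
            break  # end of the record
        if in_origin:
            # Keep letters only: drops the position number and the block spacing.
            bases.append("".join(ch for ch in line if ch.isalpha()))
        elif line.startswith("ORIGIN"):
            in_origin = True
    return "".join(bases).upper()

def base_count(sequence):
    """Count A, C, G and T, as the BASE COUNT field does."""
    counts = Counter(sequence)
    return {base: counts.get(base, 0) for base in "ACGT"}

sequence = read_origin(RECORD)
print(len(sequence), base_count(sequence))
```

The sequence data starts on the line below ORIGIN, so the sketch ignores everything before that keyword and stops reading at //.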
 Locus name helps to group entries with similar sequences. The first three characters usually denote the organism, the
fourth and fifth characters give other group designations, such as gene product, and the last character is one of a
series of sequential integers.
 Sequence Length gives the number of nucleotide base pairs (or amino acid residues) in the sequence
record.
 Molecule Type shows the type of sequenced molecule.
 GenBank Division shows the GenBank division to which a record belongs, indicated by a three-letter
abbreviation.
1. PRI - primate sequences
2. ROD - rodent sequences
3. MAM - other mammalian sequences
4. VRT - other vertebrate sequences
5. INV - invertebrate sequences
6. PLN - plant, fungal, and algal sequences
7. BCT - bacterial sequences
8. VRL - viral sequences
9. PHG - bacteriophage sequences
10. SYN - synthetic sequences
11. UNA - unannotated sequences
12. EST - EST sequences (expressed sequence tags)
13. PAT - patent sequences
14. STS - STS sequences (sequence tagged sites)
15. GSS - GSS sequences (genome survey sequences)
16. HTG - HTG sequences (high-throughput genomic sequences)
17. HTC - unfinished high-throughput cDNA sequencing
18. ENV - environmental sampling sequences
 Modification Date shows the last date of modification.
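The five LOCUS subfields described above can be pulled apart with a simple whitespace split. The sketch below assumes the older layout used in the HSHFE example (name, length in bp, molecule type, division, date); the date is written in the DD-MON-YYYY style, and that spelling is an assumption here. The function parse_locus and the dictionary keys are hypothetical names chosen for illustration.

```python
# Division codes as listed in this document.
DIVISIONS = {
    "PRI": "primate sequences",
    "ROD": "rodent sequences",
    "MAM": "other mammalian sequences",
    "VRT": "other vertebrate sequences",
    "INV": "invertebrate sequences",
    "PLN": "plant, fungal, and algal sequences",
    "BCT": "bacterial sequences",
    "VRL": "viral sequences",
    "PHG": "bacteriophage sequences",
    "SYN": "synthetic sequences",
    "UNA": "unannotated sequences",
    "EST": "expressed sequence tags",
    "PAT": "patent sequences",
    "STS": "sequence tagged sites",
    "GSS": "genome survey sequences",
    "HTG": "high-throughput genomic sequences",
    "HTC": "unfinished high-throughput cDNA sequencing",
    "ENV": "environmental sampling sequences",
}

def parse_locus(line):
    """Split a LOCUS line into the five subfields (assumed layout, see lead-in)."""
    tokens = line.split()
    if len(tokens) != 7 or tokens[0] != "LOCUS" or tokens[3] != "bp":
        raise ValueError("unexpected LOCUS layout: " + line)
    _, name, length, _, molecule, division, date = tokens
    return {
        "locus_name": name,
        "sequence_length": int(length),
        "molecule_type": molecule,
        "division": division,
        "division_name": DIVISIONS.get(division, "unknown division"),
        "modification_date": date,
    }

locus = parse_locus(
    "LOCUS       HSHFE      12146 bp    DNA             PRI       23-JUL-1999"
)
print(locus)
```

LOCUS lines that carry extra columns would be rejected by the length check rather than parsed incorrectly.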
 Definition is a brief description of the sequence that includes information such as source organism, gene
name/protein name, or some description of the sequence's function.
 Accession number indicates the unique identifier for a sequence record.
 Records from the RefSeq database have a distinct accession format that begins with two letters followed by
an underscore, e.g.:
NT_123456 constructed genomic contigs
NM_123456 mRNAs
NP_123456 proteins
NC_123456 chromosomes
 Version shows a nucleotide sequence identification number that represents a single, specific sequence in
the GenBank database.
 GI "GenInfo Identifier" is a sequence identification number for the nucleotide sequence.
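As a rough illustration of the identifier rules above, the following Python sketch splits an "accession.version" identifier, shows how the version part increases when the sequence changes, and looks up the RefSeq prefixes listed earlier. The function names split_version, next_version and refseq_type are hypothetical, and the sketch does not cover GI numbers, which are assigned by NCBI and cannot be derived from the accession.

```python
# RefSeq prefixes as listed in this document.
REFSEQ_PREFIXES = {
    "NT_": "constructed genomic contig",
    "NM_": "mRNA",
    "NP_": "protein",
    "NC_": "chromosome",
}

def split_version(identifier):
    """Split 'accession.version' into (accession, version number)."""
    accession, _, version = identifier.rpartition(".")
    if not accession or not version.isdigit():
        raise ValueError("not in accession.version format: " + identifier)
    return accession, int(version)

def next_version(identifier):
    """Return the identifier a record would carry after one sequence change."""
    accession, version = split_version(identifier)
    return f"{accession}.{version + 1}"

def refseq_type(accession):
    """Return the RefSeq record type for a known prefix, else None."""
    return REFSEQ_PREFIXES.get(accession[:3])

print(split_version("Z92910.1"))
print(next_version("U12345.1"))
print(refseq_type("NM_123456"), refseq_type("Z92910"))
```

The accession part stays the same across versions, which is why only the number after the dot is changed.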
 Keywords are words or phrases that describe the sequence.
 Source indicates free-format information including an abbreviated form of the organism name, sometimes
followed by a molecule type.
 Organism describes the formal scientific name for the source organism and its lineage.
 Reference includes publications by the authors of the sequence that discuss the data reported in the record.
 Authors lists the authors in the order in which they appear in the cited article.
Entrez Search Field: Author [AUTH]
 Title gives the title of the published work or the tentative title of an unpublished work.
Entrez Search Field: Text Word [WORD]
 Journal: MEDLINE abbreviation of the journal name.
Entrez Search Field: Journal Name [JOUR]
 Pubmed: PubMed Identifier (PMID)
 Features shows information about genes and gene products, as well as regions of biological significance
reported in the sequence.
 Source is a mandatory feature in each record that summarizes the length of the sequence, scientific name
of the source organism, and Taxon ID number. It can also include other information such as map location,
strain, clone, tissue type, etc., if provided by the submitter.
 Taxon is a stable unique identification number for the taxon of the source organism.
 CDS (Coding sequence) represents the region of nucleotides that corresponds to the sequence of amino
acids in a protein.
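To make the CDS idea concrete, the sketch below cuts a coding region out of a sequence using a simple "start..end" feature location, where positions are 1-based and inclusive, and then groups the bases into codons. The sequence and the location are placeholders invented for illustration, and extract_span is a hypothetical helper; more complex locations (for example joined or complementary spans) are deliberately not handled.

```python
import re

def extract_span(sequence, location):
    """Return the bases covered by a simple 'start..end' location (1-based, inclusive)."""
    match = re.fullmatch(r"(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError("only simple start..end locations are handled: " + location)
    start, end = int(match.group(1)), int(match.group(2))
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("location falls outside the sequence")
    # Convert the 1-based inclusive range to a Python slice.
    return sequence[start - 1:end]

# Placeholder data, not taken from a real record.
sequence = "ccatggcctaagg"
cds = extract_span(sequence, "3..11")

# Each group of three nucleotides corresponds to one amino acid or a stop signal.
codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
print(cds, codons)
```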
Gen bank (genetic sequence databank)

More Related Content

What's hot (20)

EMBL- European Molecular Biology Laboratory
EMBL- European Molecular Biology LaboratoryEMBL- European Molecular Biology Laboratory
EMBL- European Molecular Biology Laboratory
 
Nucleic acid and protein databanks
Nucleic acid and protein databanksNucleic acid and protein databanks
Nucleic acid and protein databanks
 
Data Retrieval Systems
Data Retrieval SystemsData Retrieval Systems
Data Retrieval Systems
 
Composite and Specialized databases
Composite and Specialized databasesComposite and Specialized databases
Composite and Specialized databases
 
Fasta
FastaFasta
Fasta
 
Uni prot presentation
Uni prot presentationUni prot presentation
Uni prot presentation
 
Scop database
Scop databaseScop database
Scop database
 
Proteins databases
Proteins databasesProteins databases
Proteins databases
 
Primary and secondary databases ppt by puneet kulyana
Primary and secondary databases ppt by puneet kulyanaPrimary and secondary databases ppt by puneet kulyana
Primary and secondary databases ppt by puneet kulyana
 
Swiss PROT
Swiss PROT Swiss PROT
Swiss PROT
 
DNA data bank of japan (DDBJ)
DNA data bank of japan (DDBJ)DNA data bank of japan (DDBJ)
DNA data bank of japan (DDBJ)
 
UniProt
UniProtUniProt
UniProt
 
Kegg databse
Kegg databseKegg databse
Kegg databse
 
Primary, secondary, tertiary biological database
Primary, secondary, tertiary biological databasePrimary, secondary, tertiary biological database
Primary, secondary, tertiary biological database
 
Biological databases
Biological databasesBiological databases
Biological databases
 
blast bioinformatics
blast bioinformaticsblast bioinformatics
blast bioinformatics
 
NCBI
NCBINCBI
NCBI
 
Ddbj
DdbjDdbj
Ddbj
 
Nucleic Acid Sequence databases
Nucleic Acid Sequence databasesNucleic Acid Sequence databases
Nucleic Acid Sequence databases
 
Gene bank by kk sahu
Gene bank by kk sahuGene bank by kk sahu
Gene bank by kk sahu
 

Viewers also liked

Presentation at the EMBL-EBI Industry RDF meeting
Presentation at the EMBL-EBI  Industry RDF meetingPresentation at the EMBL-EBI  Industry RDF meeting
Presentation at the EMBL-EBI Industry RDF meetingJohannes Keizer
 
European Molecular Biology Laboratory (EMBL)- European Bioinformatics Institu...
European Molecular Biology Laboratory (EMBL)- European Bioinformatics Institu...European Molecular Biology Laboratory (EMBL)- European Bioinformatics Institu...
European Molecular Biology Laboratory (EMBL)- European Bioinformatics Institu...ExternalEvents
 
Trans 2 butene
Trans 2 buteneTrans 2 butene
Trans 2 buteneimanijc
 
Unit 2 8 Alcohols And Halogenoalkanes Notes
Unit 2 8 Alcohols And Halogenoalkanes NotesUnit 2 8 Alcohols And Halogenoalkanes Notes
Unit 2 8 Alcohols And Halogenoalkanes NotesM F Ebden
 
rules in naming organic compound
rules in naming organic compoundrules in naming organic compound
rules in naming organic compoundvxiiayah
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomicsprateek kumar
 
FERMENTATIONS , PHOTOSYNTHESIS & NITROGEN FIXATION
FERMENTATIONS , PHOTOSYNTHESIS & NITROGEN FIXATION FERMENTATIONS , PHOTOSYNTHESIS & NITROGEN FIXATION
FERMENTATIONS , PHOTOSYNTHESIS & NITROGEN FIXATION Vidya Kalaivani Rajkumar
 

Viewers also liked (20)

Gene bank
Gene bankGene bank
Gene bank
 
Group discussion
Group discussionGroup discussion
Group discussion
 
Biological databases
Biological databasesBiological databases
Biological databases
 
Retrieval and Statistical Analysis of Genbank Data (RASA-GD)
Retrieval and Statistical Analysis of Genbank Data (RASA-GD)Retrieval and Statistical Analysis of Genbank Data (RASA-GD)
Retrieval and Statistical Analysis of Genbank Data (RASA-GD)
 
Presentation at the EMBL-EBI Industry RDF meeting
Presentation at the EMBL-EBI  Industry RDF meetingPresentation at the EMBL-EBI  Industry RDF meeting
Presentation at the EMBL-EBI Industry RDF meeting
 
Ozonolysis of 2-Butenes
Ozonolysis of 2-ButenesOzonolysis of 2-Butenes
Ozonolysis of 2-Butenes
 
European Molecular Biology Laboratory (EMBL)- European Bioinformatics Institu...
European Molecular Biology Laboratory (EMBL)- European Bioinformatics Institu...European Molecular Biology Laboratory (EMBL)- European Bioinformatics Institu...
European Molecular Biology Laboratory (EMBL)- European Bioinformatics Institu...
 
Nomenclature
NomenclatureNomenclature
Nomenclature
 
Trans 2 butene
Trans 2 buteneTrans 2 butene
Trans 2 butene
 
Unit 2 8 Alcohols And Halogenoalkanes Notes
Unit 2 8 Alcohols And Halogenoalkanes NotesUnit 2 8 Alcohols And Halogenoalkanes Notes
Unit 2 8 Alcohols And Halogenoalkanes Notes
 
rules in naming organic compound
rules in naming organic compoundrules in naming organic compound
rules in naming organic compound
 
Locus link
Locus linkLocus link
Locus link
 
Table manners
Table mannersTable manners
Table manners
 
Protein sequence databases
Protein sequence databasesProtein sequence databases
Protein sequence databases
 
Negotiation skill
Negotiation skillNegotiation skill
Negotiation skill
 
Food chain
Food chainFood chain
Food chain
 
Community ecology
Community ecologyCommunity ecology
Community ecology
 
Comparative genomics
Comparative genomicsComparative genomics
Comparative genomics
 
FERMENTATIONS , PHOTOSYNTHESIS & NITROGEN FIXATION
FERMENTATIONS , PHOTOSYNTHESIS & NITROGEN FIXATION FERMENTATIONS , PHOTOSYNTHESIS & NITROGEN FIXATION
FERMENTATIONS , PHOTOSYNTHESIS & NITROGEN FIXATION
 
Bioinformatics assignment
Bioinformatics assignmentBioinformatics assignment
Bioinformatics assignment
 

Similar to Gen bank (genetic sequence databank)

Database in bioinformatics
Database in bioinformaticsDatabase in bioinformatics
Database in bioinformaticsVinaKhan1
 
Bioinformatics final
Bioinformatics finalBioinformatics final
Bioinformatics finalRainu Rajeev
 
NCBI Boot Camp for Beginners Slides
NCBI Boot Camp for Beginners SlidesNCBI Boot Camp for Beginners Slides
NCBI Boot Camp for Beginners SlidesJackie Wirz, PhD
 
02. Biological sequence databases.pptx
02. Biological sequence databases.pptx02. Biological sequence databases.pptx
02. Biological sequence databases.pptxHussainTaqi1
 
Sequence and Structural Databases of DNA and Protein, and its significance in...
Sequence and Structural Databases of DNA and Protein, and its significance in...Sequence and Structural Databases of DNA and Protein, and its significance in...
Sequence and Structural Databases of DNA and Protein, and its significance in...SBituila
 
Sequence and Structural Databases of DNA and Protein, and its significance in...
Sequence and Structural Databases of DNA and Protein, and its significance in...Sequence and Structural Databases of DNA and Protein, and its significance in...
Sequence and Structural Databases of DNA and Protein, and its significance in...BibiQuinah
 
Role of bioinformatics in life sciences research
Role of bioinformatics in life sciences researchRole of bioinformatics in life sciences research
Role of bioinformatics in life sciences researchAnshika Bansal
 
BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES nadeem akhter
 
The uni prot knowledgebase
The uni prot knowledgebaseThe uni prot knowledgebase
The uni prot knowledgebaseKew Sama
 

Similar to Gen bank (genetic sequence databank) (20)

Major biological nucleotide databases
Major biological nucleotide databasesMajor biological nucleotide databases
Major biological nucleotide databases
 
Database in bioinformatics
Database in bioinformaticsDatabase in bioinformatics
Database in bioinformatics
 
Biological databases
Biological databasesBiological databases
Biological databases
 
Biological Databases
Biological DatabasesBiological Databases
Biological Databases
 
Biological databases
Biological databasesBiological databases
Biological databases
 
Bioinformatics final
Bioinformatics finalBioinformatics final
Bioinformatics final
 
Dn abarcode
Dn abarcodeDn abarcode
Dn abarcode
 
NCBI Boot Camp for Beginners Slides
NCBI Boot Camp for Beginners SlidesNCBI Boot Camp for Beginners Slides
NCBI Boot Camp for Beginners Slides
 
02. Biological sequence databases.pptx
02. Biological sequence databases.pptx02. Biological sequence databases.pptx
02. Biological sequence databases.pptx
 
Protein databases
Protein databasesProtein databases
Protein databases
 
Intro to databases
Intro to databasesIntro to databases
Intro to databases
 
Asnmnt 4
Asnmnt 4Asnmnt 4
Asnmnt 4
 
Sequence and Structural Databases of DNA and Protein, and its significance in...
Sequence and Structural Databases of DNA and Protein, and its significance in...Sequence and Structural Databases of DNA and Protein, and its significance in...
Sequence and Structural Databases of DNA and Protein, and its significance in...
 
Sequence and Structural Databases of DNA and Protein, and its significance in...
Sequence and Structural Databases of DNA and Protein, and its significance in...Sequence and Structural Databases of DNA and Protein, and its significance in...
Sequence and Structural Databases of DNA and Protein, and its significance in...
 
Role of bioinformatics in life sciences research
Role of bioinformatics in life sciences researchRole of bioinformatics in life sciences research
Role of bioinformatics in life sciences research
 
BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES
 
Biological database
Biological databaseBiological database
Biological database
 
2016 02 23_biological_databases_part1
2016 02 23_biological_databases_part12016 02 23_biological_databases_part1
2016 02 23_biological_databases_part1
 
Introduction to Biological databases
Introduction to Biological databasesIntroduction to Biological databases
Introduction to Biological databases
 
The uni prot knowledgebase
The uni prot knowledgebaseThe uni prot knowledgebase
The uni prot knowledgebase
 

More from Vidya Kalaivani Rajkumar

Transgenic plants- Abiotic stress tolerance
Transgenic plants- Abiotic stress toleranceTransgenic plants- Abiotic stress tolerance
Transgenic plants- Abiotic stress toleranceVidya Kalaivani Rajkumar
 
Protein structure visualization tools-RASMOL
Protein structure visualization tools-RASMOLProtein structure visualization tools-RASMOL
Protein structure visualization tools-RASMOLVidya Kalaivani Rajkumar
 
Protein structure visualisation tools-RasMol
Protein structure visualisation tools-RasMolProtein structure visualisation tools-RasMol
Protein structure visualisation tools-RasMolVidya Kalaivani Rajkumar
 

More from Vidya Kalaivani Rajkumar (20)

Recombinant vaccines-Peptide Vaccines
Recombinant vaccines-Peptide Vaccines Recombinant vaccines-Peptide Vaccines
Recombinant vaccines-Peptide Vaccines
 
Transgenic plants- Abiotic stress tolerance
Transgenic plants- Abiotic stress toleranceTransgenic plants- Abiotic stress tolerance
Transgenic plants- Abiotic stress tolerance
 
Bioreactors in tissue engineering
Bioreactors in tissue engineeringBioreactors in tissue engineering
Bioreactors in tissue engineering
 
Tissue assembly in microgravity
Tissue assembly in microgravityTissue assembly in microgravity
Tissue assembly in microgravity
 
In vivo synthesis of tissues and organs
In vivo synthesis of tissues and organsIn vivo synthesis of tissues and organs
In vivo synthesis of tissues and organs
 
Bioartificial pancreas
Bioartificial pancreasBioartificial pancreas
Bioartificial pancreas
 
Biomaterials for tissue engineering
Biomaterials for tissue engineeringBiomaterials for tissue engineering
Biomaterials for tissue engineering
 
Haematopoietic system
Haematopoietic systemHaematopoietic system
Haematopoietic system
 
Fasta
FastaFasta
Fasta
 
Water vascular system of star fish
Water vascular system of star fishWater vascular system of star fish
Water vascular system of star fish
 
Cephalopodes are advance molluscs
Cephalopodes are advance molluscsCephalopodes are advance molluscs
Cephalopodes are advance molluscs
 
Beat air pollution
Beat air pollution Beat air pollution
Beat air pollution
 
Birth control methods
Birth control methodsBirth control methods
Birth control methods
 
Future of human evolution
Future of human evolutionFuture of human evolution
Future of human evolution
 
Sequence alignment
Sequence alignmentSequence alignment
Sequence alignment
 
Assignment on developmental zoology
Assignment on developmental zoologyAssignment on developmental zoology
Assignment on developmental zoology
 
Development of chick
Development of chickDevelopment of chick
Development of chick
 
Protein structure visualization tools-RASMOL
Protein structure visualization tools-RASMOLProtein structure visualization tools-RASMOL
Protein structure visualization tools-RASMOL
 
Swiss pdb viewer
Swiss pdb viewerSwiss pdb viewer
Swiss pdb viewer
 
Protein structure visualisation tools-RasMol
Protein structure visualisation tools-RasMolProtein structure visualisation tools-RasMol
Protein structure visualisation tools-RasMol
 

Recently uploaded

Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxSwapnil Therkar
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxpradhanghanshyam7136
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptxanandsmhk
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxUmerFayaz5
 
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfNAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfWadeK3
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PPRINCE C P
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PPRINCE C P
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Lokesh Kothari
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 

Recently uploaded (20)

Analytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptxAnalytical Profile of Coleus Forskohlii | Forskolin .pptx
Analytical Profile of Coleus Forskohlii | Forskolin .pptx
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Cultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptxCultivation of KODO MILLET . made by Ghanshyam pptx
Cultivation of KODO MILLET . made by Ghanshyam pptx
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptxUnlocking  the Potential: Deep dive into ocean of Ceramic Magnets.pptx
Unlocking the Potential: Deep dive into ocean of Ceramic Magnets.pptx
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdfNAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
NAVSEA PEO USC - Unmanned & Small Combatants 26Oct23.pdf
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Artificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C PArtificial Intelligence In Microbiology by Dr. Prince C P
Artificial Intelligence In Microbiology by Dr. Prince C P
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
Labelling Requirements and Label Claims for Dietary Supplements and Recommend...
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 

Gen bank (genetic sequence databank)

  • 1. GenBank (Genetic Sequence Databank) Introduction:  GenBank® is the genetic sequence database at the National Center for Biotechnology Information (NCBI).  It was established in the year 1982 and now maintained by the National Center for Biotechnology (NCBI).  DNA sequences can be submitted to GenBank using several different methods.  It contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects.  It has a flat file structure that is an ASCII text file, readable & downloadable by both humans and computers.  There are two main ways of making batch sequence submissions to GenBank: NCBI’s Barcode Submission Tool (BarSTool) and Sequin.
  • 2.  Entry data contains information on: 1. The sequence; 2. Accession numbers; 3. The scientific and gene names; 4. Taxonomy/phylogenetic classification of the source organism; 5. A feature that identifies coding regions; 6. References to published literature; 7. Transcription units 8. Mutation sites. GenBank flat file Format 1. The LOCUS field: It consists of five different subfields, namely:  1a Locus Name (e.g. HSHFE) - It is a tag for grouping similar sequences.  The first two or three letters usually designate the organism.
  • 3.  In this case HS stands for Homo sapiens. The last several characters are associated with another group designation, such as gene product. In this example, the last three digits represent the gene symbol, HFE.  1b Sequence Length (12146 bp) – It is the total number of nucleotide base pairs (or amino acid residues) in the sequence record.  1c Molecule Type (e.g. DNA) - Type of molecule that was sequenced.  1d GenBank Division (PRI) - GenBank has different divisions.  In this example, PRI stands for primate sequences.  Other divisions include ROD (rodent sequences), MAM (other mammal sequences), PLN (plant, fungal, and algal sequences), & BCT (bacterial sequences). 2. 1e Modification Date (23-July-1999) - Date of most recent modification made to the record. DEFINITION: – It is a brief description of the sequence.  The description may include source organism name, gene or protein name, or designation as untranscribed or untranslated sequences (e.g., a promoter region).  For sequences containing a coding region (CDS), the definition field may also contain a “completeness” qualifier such as "complete CDS" or "exon 1." 3. ACCESSION (Z92910): – It is a unique identifier assigned to a complete sequence record.  This number never changes, even if the record is modified. 4. VERSION (Z92910.1) – It is an identification number assigned to a single, specific sequence in the database.  This number is in the format “accession.version.”  If any changes are made to the sequence data, the version part of the number will increase by one.  E.g. U12345.1 becomes U12345.2. 5. Gene Identifier (GI) (1890179) - Also a sequence identification number.  Whenever a sequence is changed, the version number is increased and a new GI is assigned. 6. KEYWORDS (haemochromatosis; HFE gene) – A “keyword” can be “any word or phrase used to describe the sequence”. 7. SOURCE (human) - Usually contains an abbreviated or common name of the source organism.
  • 4.
      8. ORGANISM (Homo sapiens): The scientific name (usually genus and species).
      9. REFERENCE: A citation of publications by the sequence authors that support the information presented in the sequence record.
         - Several references may be included in one record.
         - References are automatically sorted from the oldest to the newest.
         - Cited publications are searchable by author, article or publication title, journal title, or MEDLINE unique identifier (UID).
      10. The FEATURES Table:
  • 5.
      11. BASE COUNT & ORIGIN: BASE COUNT gives the total number of adenine (A), cytosine (C), guanine (G), and thymine (T) bases in the sequence.
      12. ORIGIN: Contains the sequence data, which begins on the line immediately below the field title.
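The field descriptions above can be sketched in a few lines of Python. This is a minimal illustration, not a full GenBank parser. The sample LOCUS line is assembled from the example values in the text (HSHFE, 12146 bp, DNA, PRI) and assumes exactly the five subfields described, whereas real records may carry extra columns such as topology. The helper names `parse_locus`, `split_version`, and `base_count` and the short sequence lines in the usage note are hypothetical.

```python
from collections import Counter


def parse_locus(line):
    """Split a LOCUS line into the five subfields described above.

    Assumes the simple layout: LOCUS name length unit type division date.
    """
    parts = line.split()
    if len(parts) != 7 or parts[0] != "LOCUS":
        raise ValueError("unexpected LOCUS layout: " + line)
    _, name, length, unit, molecule_type, division, date = parts
    return {
        "name": name,
        "length": int(length),
        "unit": unit,                    # "bp" for nucleotides
        "molecule_type": molecule_type,
        "division": division,
        "modification_date": date,
    }


def split_version(version):
    """Split 'accession.version' (e.g. Z92910.1) into its two parts."""
    accession, _, number = version.partition(".")
    return accession, int(number)


def base_count(origin_lines):
    """Count a, c, g, t in ORIGIN sequence lines.

    Position numbers, spaces, and any other symbols are ignored.
    """
    counts = Counter()
    for line in origin_lines:
        for ch in line.lower():
            if ch in "acgt":
                counts[ch] += 1
    return counts
```

For example, `parse_locus("LOCUS       HSHFE       12146 bp    DNA    PRI    23-JUL-1999")["division"]` gives `"PRI"`, and `split_version("U12345.2")` gives `("U12345", 2)`. Keeping the accession and version separate mirrors the rule that the accession stays fixed while the version increases with each sequence change.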
  • 6. //
      - Locus Name helps group entries with similar sequences. The first three characters usually denote the organism, the fourth and fifth characters give other group designations, such as gene product, and the last character is one of a series of sequential integers.
      - Sequence Length contains the number of nucleotide base pairs (or amino acid residues) in the sequence record.
      - Molecule Type shows the type of molecule that was sequenced.
      - GenBank Division shows the GenBank division to which a record belongs, indicated by a three-letter abbreviation:
          1. PRI - primate sequences
          2. ROD - rodent sequences
          3. MAM - other mammalian sequences
          4. VRT - other vertebrate sequences
          5. INV - invertebrate sequences
          6. PLN - plant, fungal, and algal sequences
          7. BCT - bacterial sequences
          8. VRL - viral sequences
          9. PHG - bacteriophage sequences
          10. SYN - synthetic sequences
          11. UNA - unannotated sequences
          12. EST - EST sequences (expressed sequence tags)
          13. PAT - patent sequences
          14. STS - STS sequences (sequence tagged sites)
          15. GSS - GSS sequences (genome survey sequences)
          16. HTG - HTG sequences (high-throughput genomic sequences)
          17. HTC - unfinished high-throughput cDNA sequencing
          18. ENV - environmental sampling sequences
      - Modification Date shows the date of the last modification.
      - Definition is a brief description of the sequence that includes information such as the source organism, gene name/protein name, or some description of the sequence's function.
      - Accession number is the unique identifier for a sequence record.
  • 7.
      - Records from the RefSeq database have accession numbers of the form:
          NT_123456  constructed genomic contigs
          NM_123456  mRNAs
          NP_123456  proteins
          NC_123456  chromosomes
      - Version shows a nucleotide sequence identification number that represents a single, specific sequence in the GenBank database.
      - GI ("GenInfo Identifier") is a sequence identification number for the nucleotide sequence.
      - Keywords are words or phrases that describe the sequence.
      - Source contains free-format information, including an abbreviated form of the organism name, sometimes followed by a molecule type.
      - Organism gives the formal scientific name of the source organism and its lineage.
      - Reference includes publications by the authors of the sequence that discuss the data reported in the record.
          - Authors: List of authors in the order in which they appear in the cited article. Entrez Search Field: Author [AUTH]
          - Title: The title of the published work, or the tentative title of an unpublished work. Entrez Search Field: Text Word [WORD]
          - Journal: MEDLINE abbreviation of the journal name. Entrez Search Field: Journal Name [JOUR]
          - Pubmed: PubMed Identifier (PMID)
      - Features shows information about genes and gene products, as well as regions of biological significance reported in the sequence.
      - The source feature is mandatory in each record and summarizes the length of the sequence, the scientific name of the source organism, and the Taxon ID number. It can also include other information such as map location, strain, clone, tissue type, etc., if provided by the submitter.
      - Taxon is a stable unique identification number for the taxon of the source organism.
      - CDS (coding sequence) represents the region of nucleotides that corresponds to the sequence of amino acids in a protein.
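The RefSeq accession prefixes listed on this slide lend themselves to a simple lookup. The sketch below is only an illustration covering the four prefixes named in the text. The names `REFSEQ_PREFIXES` and `refseq_record_type` are hypothetical, and other RefSeq prefixes are not handled.

```python
# Prefix-to-record-type table, limited to the four prefixes named in the text.
REFSEQ_PREFIXES = {
    "NT_": "constructed genomic contig",
    "NM_": "mRNA",
    "NP_": "protein",
    "NC_": "chromosome",
}


def refseq_record_type(accession):
    """Return the RefSeq record type for an accession, or None if the
    accession does not start with one of the prefixes in the table."""
    return REFSEQ_PREFIXES.get(accession[:3])
```

With this table, `refseq_record_type("NM_123456")` returns `"mRNA"`, while an ordinary GenBank accession such as `Z92910` returns `None`.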