NCBI
National Center for Biotechnology
Information
Site: www.ncbi.nlm.nih.gov
By Richa Sharma
M.Sc. Biomedical Sciences
Dr. BR Ambedkar Center for Biomedical
Research (ACBR)
INTRODUCTION
NCBI was established in 1988 as part of the National Library
of Medicine at the National Institutes of Health in Bethesda,
Maryland, USA.
NCBI HOME PAGE
DIFFERENCES BETWEEN
DATABASE AND TOOL
DATABASE
• A collection of data that is structured, searchable,
  periodically updated, and cross-referenced.
• Different databases are:
  • Genome Database
  • Sequence Database
  • Protein Database
  • Literature Database
  • Disease Database
TOOL
• A program used to extract or retrieve the desired
  information from a database.
• Different types of tools are:
  • Database retrieval tool, i.e. Entrez
  • BLAST
  • ORF Finder
  • e-PCR
  • Spidey
DATABASES AND TOOLS OF NCBI
TOOLS OF NCBI
DATABASE RETRIEVAL TOOL-
ENTREZ
Entrez is an integrated database search and retrieval
system that draws on DNA and protein sequence data,
population sets, whole genomes, macromolecular structures,
and the biomedical literature via PubMed.
Entrez provides extensive links within and between
database records.
http://www.ncbi.nlm.nih.gov/gquery/
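Entrez can also be queried from a program. The slides do not cover this, so the following is only a minimal sketch, assuming NCBI's E-utilities web interface (the `esearch.fcgi` and `efetch.fcgi` endpoints). The helper names `esearch_url` and `efetch_url` are hypothetical, and the code only builds the request URLs rather than contacting NCBI.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities (assumed here; not given in the slides).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 5) -> str:
    """Build a search URL: which records in `db` match `term`?"""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def efetch_url(db: str, ids: list[str], rettype: str = "fasta",
               retmode: str = "text") -> str:
    """Build a retrieval URL for the records whose identifiers are in `ids`."""
    params = {"db": db, "id": ",".join(ids),
              "rettype": rettype, "retmode": retmode}
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Illustrative query only; the search term is an example, not a tested result.
    print(esearch_url("nucleotide", "Xenopus laevis[Organism] AND beta-globin"))
```

A search request returns a list of record identifiers, which can then be passed to the fetch request. This mirrors the search-then-retrieve pattern of the Entrez web pages.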
ARCHITECTURE OF THE ENTREZ SYSTEM
BLAST-BASIC LOCAL ALIGNMENT
SEARCH TOOL
The BLAST programs perform sequence-similarity searches
against a variety of sequence databases, returning a set of
gapped alignments with links to full database records, to
UniGene, Gene, the MMDB, or GEO.
The BLAST tools available at NCBI are classified into
different categories.
Two important ones are:
• Standard BLAST
• MegaBLAST
STANDARD BLAST
Standard BLAST includes:
• blastn: compares a nucleotide query sequence against a
  nucleotide sequence database.
• blastp: compares an amino acid query sequence against a
  protein sequence database.
• blastx: compares a nucleotide query sequence, translated
  in all six reading frames, against a protein sequence
  database.
• tblastn: compares a protein query sequence against a
  nucleotide sequence database translated in all six
  reading frames.
• tblastx: compares the six-frame translations of a
  nucleotide query against the six-frame translations of
  a nucleotide sequence database.
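The choice among the five standard programs follows from the type of the query and the type of the database. A minimal sketch of that decision is shown below; the function name `choose_blast_program` and its `translated` flag are hypothetical and are not part of any NCBI software.

```python
def choose_blast_program(query: str, database: str,
                         translated: bool = False) -> str:
    """Pick a standard BLAST program from the query and database types.

    `query` and `database` are "nucleotide" or "protein". `translated`
    only matters for nucleotide-vs-nucleotide searches, where it selects
    a comparison of six-frame translations (tblastx) instead of blastn.
    """
    if query == "nucleotide" and database == "nucleotide":
        return "tblastx" if translated else "blastn"
    table = {
        ("protein", "protein"): "blastp",
        ("nucleotide", "protein"): "blastx",   # query is translated
        ("protein", "nucleotide"): "tblastn",  # database is translated
    }
    try:
        return table[(query, database)]
    except KeyError:
        raise ValueError(f"unsupported combination: {query!r} vs {database!r}")


if __name__ == "__main__":
    print(choose_blast_program("nucleotide", "protein"))
    print(choose_blast_program("nucleotide", "nucleotide", translated=True))
```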
MegaBLAST
MegaBLAST is a program optimized for aligning long, highly
similar sequences.
It works only with DNA sequences, so the only program it
supports is “blastn”.
It is faster than standard blastn but less sensitive.
SEQUENCE SUBMISSION TO NCBI
The databases are constantly updated through new sequence
submissions, which are made using the following sequence
submission tools:
1. BankIt
2. Sequin
BankIt
BankIt is a web-based GenBank sequence submission tool.
It is the tool of choice for simple submissions, especially
when only one or a small number of records are to be
submitted. Submitters can also use it to update their
existing GenBank records. Sequence analysis tools are not
required for submission through this process.
SEQUIN
Sequin is a stand-alone software tool developed by NCBI
that aids in submitting and updating entries in the
sequence databases. It handles multiple sequence
submissions and provides increased capacity for complex
submissions containing long sequences, multiple
annotations, segmented sets of DNA, or phylogenetic and
population studies.
It also provides graphical viewing and editing options.
NCBI HOME PAGE
SPECIALISED TOOLS
Some of the specialized tools for sequence analysis are:
1. ORF Finder
2. e-PCR
3. Spidey
Open Reading Frame (ORF)
Finder
ORF Finder is a graphical analysis tool that finds all open
reading frames of a selectable minimum size in a user’s
sequence or in a sequence already in the database.
It uses the standard or alternative genetic codes to identify
all open reading frames.
This is helpful in preparing complete and accurate
sequence submissions. It is also packaged with the Sequin
sequence submission software.
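The idea behind an ORF search can be sketched in a few lines. The example below is not NCBI's implementation. It assumes the standard genetic code (start codon ATG; stop codons TAA, TAG, TGA), scans all six reading frames, and reports ORFs that run from a start codon to the next in-frame stop codon. The name `find_orfs` and the coordinate convention are illustrative choices.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard genetic code (assumption)
START_CODON = "ATG"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_length: int = 30):
    """Find ORFs of at least `min_length` nucleotides in six frames.

    Returns (strand, frame, start, end) tuples. Coordinates are 0-based
    with an exclusive end, measured on the strand that was scanned, and
    the length includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i                      # open a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3                    # close it at the stop
                    if end - start >= min_length:
                        orfs.append((strand, frame, start, end))
                    start = None
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATAGGG", min_length=9))
```

Lowering `min_length` reports more, shorter candidates, which corresponds to the selectable minimum size mentioned above.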
e-PCR (Electronic Polymerase
Chain Reaction)
e-PCR is a computational procedure used to identify
sequence-tagged sites (STSs) within DNA sequences. While
looking for potential STSs in a DNA sequence, e-PCR searches
for sub-sequences that closely match the PCR primers used to
generate known STSs and that have the correct order,
orientation, and spacing. The newer version of e-PCR also
provides a search mode that runs a query sequence against a
sequence database.
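The order, orientation, and spacing test can be illustrated with a simplified sketch. It is not the e-PCR program itself: it requires exact primer matches, whereas the tool accepts close matches, and the function name `epcr_hits` and its parameters are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def epcr_hits(template: str, forward: str, reverse: str,
              min_size: int, max_size: int):
    """Find candidate STS locations on the forward strand of `template`.

    A hit needs the forward primer, followed downstream by the reverse
    complement of the reverse primer (correct order and orientation),
    with a product size within [min_size, max_size] (correct spacing).
    Returns (start, end, size) tuples, 0-based with an exclusive end.
    """
    template = template.upper()
    fwd = forward.upper()
    rev_site = reverse_complement(reverse.upper())
    hits = []
    start = template.find(fwd)
    while start != -1:
        site = template.find(rev_site, start + len(fwd))
        while site != -1:
            size = site + len(rev_site) - start
            if min_size <= size <= max_size:
                hits.append((start, site + len(rev_site), size))
            site = template.find(rev_site, site + 1)
        start = template.find(fwd, start + 1)
    return hits


if __name__ == "__main__":
    # Toy template: forward primer site, spacer, reverse primer site.
    toy = "TT" + "ACCCG" + "TTTTTTTT" + "GGATGC" + "TT"
    print(epcr_hits(toy, "ACCCG", "GCATCC", min_size=10, max_size=50))
```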
SPIDEY
Spidey is an mRNA-to-genomic alignment program that uses
local alignment tools such as BLAST to find its alignments.
Spidey takes as input a single genomic sequence and a set of
mRNA sequences in FASTA format. It first defines windows on
the genomic sequence and then performs the mRNA-to-genomic
alignment separately within each window, to avoid including
exons from paralogs and pseudogenes. It has no maximum
intron size and does not favour shorter or longer introns.
Databases
• A structured collection of information.
• Consists of basic units called records or entries.
• The perfect database is:
  • Comprehensive but easy to search
  • Cross-referenced
  • Minimally redundant
NCBI Databases
• Nucleotide database
• Literature database
• Protein database
• Gene expression database
• Structural database
• Chemical database
• Other databases
Kinds of databases
Primary database
• Original submissions by experimentalists.
• Database staff organise the data but do not add
  additional information.
• Example: GenBank
Derivative databases
• Derived from primary data.
• Content controlled by a third party.
• Examples: RefSeq, Swiss-Prot, UniGene
Nucleotide database
• GenBank
  • NCBI’s primary sequence database.
  • A comprehensive public database of nucleotide sequences.
  • GenBank, together with EMBL and DDBJ, makes up the
    International Nucleotide Sequence Database (INSD)
    collaboration.
  • The three databases exchange data daily to ensure a
    uniform and comprehensive collection of sequence
    information.
Accession numbers are labels for
sequences
• DNA sequences and other molecular data are tagged with
  accession numbers, which identify a sequence or other
  record relevant to molecular data.
• An accession number is a string of letters and/or numbers
  that corresponds to a molecular sequence.
• It is shared among the three collaborating databases and
  remains constant over the lifetime of the record.
• The DNA sequence within a GenBank record is also assigned
  a unique NCBI identifier called a ‘gi’, which appears on
  the version line of flat file records, following the
  accession number.
Retrieval of the nucleotide sequence of the
beta-globin gene from Xenopus laevis
NCBI’s Derivative Sequence
Database
• RefSeq
  • A non-redundant collection of nucleotide and protein
    sequences.
  • Derived from the primary submissions available in
    GenBank.
  • RefSeq records can be distinguished from GenBank records
    by the format of the accession series.
  • RefSeq accession numbers are formatted as two alphabetic
    characters followed by an underscore ‘_’.
  • GenBank accession numbers never include an underscore.
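The underscore rule lends itself to a quick check in code. The sketch below assumes only what the slide states: two letters followed by an underscore marks a RefSeq accession, and an accession with no underscore is treated as GenBank-style. The function name `classify_accession`, the optional `.version` suffix handling, and the sample identifiers are illustrative.

```python
import re

# Two letters, an underscore, the identifier body, and an optional
# ".version" suffix such as ".5" (the suffix handling is an assumption).
REFSEQ_PATTERN = re.compile(r"^[A-Z]{2}_[A-Z0-9]+(\.\d+)?$")


def classify_accession(accession: str) -> str:
    """Classify an accession as 'RefSeq', 'GenBank-style', or 'unknown'."""
    acc = accession.strip().upper()
    if REFSEQ_PATTERN.match(acc):
        return "RefSeq"
    if "_" in acc:
        # Has an underscore but not in the two-letter prefix position.
        return "unknown"
    return "GenBank-style"


if __name__ == "__main__":
    # Sample identifiers are used only to illustrate the two formats.
    for acc in ("NM_000518.5", "U01317.1"):
        print(acc, classify_accession(acc))
```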
Literature database
• PMC: PubMed Central
  • A digital archive of peer-reviewed journals in the life
    sciences, providing access to full-text articles.
  • All free PMC articles are identified in PubMed search
    results, and PMC itself can be searched using Entrez.
Retrieval of the complete entry on the role of
remorin protein from the PubMed
database
Protein database
• Entrez Protein is the protein sequence database of NCBI.
• The protein sequences in this database come from several
  different sources, such as Swiss-Prot and PDB.
• There are GenPept translations for each of the coding
  sequences within the GenBank nucleotide database.
• The Entrez Protein database is cross-linked to the Entrez
  Taxonomy database.
• It is also linked to CDD (the Conserved Domain Database).
• After clicking on an individual Entrez Protein search
  result, the protein sequence is displayed in a format
  known as GenPept.
Expression database
• GEO: Gene Expression Omnibus
  • Distribution and regulation of the transcriptional
    products of normal and abnormal cell types.
• SAGEmap: serial analysis of gene expression map.
Structural database
• MMDB: Molecular Modeling Database.
• Contains 3D macromolecular structures.
• X-ray diffraction (XRD) and NMR are used for experimental
  structure determination.
• These structures provide a wealth of information about
  biological function, the mechanisms linked to that
  function, its evolutionary history, and the relationships
  between macromolecules.
Chemical database
• PubChem is a database of chemical molecules maintained
  by NCBI.
• It focuses on the chemical, structural, and biological
  properties of small molecules.
• Molecular mass below 2000 u.
Other databases
• OMIM: Online Mendelian Inheritance in Man.
  • A comprehensive, authoritative, and timely knowledge
    base of human genes and genetic disorders.
• OMIA: Online Mendelian Inheritance in Animals.
  • A database of genes, inherited disorders, and traits in
    animal species other than human and mouse.
THANK YOU!

Ncbi

  • 1. NCBI National Centre For Biotechnology Information Site: www.ncbi.nlm.nih.gov By Richa Sharma M.Sc. Biomedical Sciences Dr. BR Ambedkar Center for Biomedical aresearch (ACBR)
  • 2. INTRODUCTION NCBI was established in the year 1988, as a part of the National Library of Medicine at the National Institutes of Health, Maryland, USA
  • 4. DIFFERENCES BETWEEN DATABASE AND TOOL DATABASE  It is a collection of data that is structured, searchable, updated periodically and cross- referenced.  Different databases are:  Genome Database  Sequence Database  Protein Database  Literature Database  Disease Database TOOL  A program that is used to extract or retrieve the desired information from the database.  Different types of tools are:  Database Retrieval Tool i.e. Entrez  BLAST  ORF Finder  ePCR  Spidey
  • 7. DATABASE RETRIEVAL TOOL- ENTREZ Entrez is an integrated database search and retrieval system that extracts information from DNA and protein sequence data, population sets, whole genome, macromolecular structures, and the biomedical literature via PubMed. Entrez provides extensive links within and between database records. http://www.ncbi.nlm.nih.gov/gquery/
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13. ARCHITECTURE OF THE ENTREZ SYSTEM
  • 14. BLAST-BASIC LOCAL ALIGNMENT SEARCH TOOL The BLAST programs perform sequence-similarity searches against a variety of sequence databases, returning a set of gapped alignments with links to full database records, to UniGene, Gene, the MMDB, or GEO. The BLAST tools available at NCBI are classified into different categories. Two important ones are:  Standard BLAST  MegaBLAST
  • 15. STANDARD BLAST Standard BLAST includes:  blastn : Comparing the nucleotide sequence query against a nucleotide sequence database.  blastp : Comparing the amino acid query against a protein sequence database.  blastx : Comparing the nucleotide query sequence translated in all reading frames against a protein database.
  • 16. • tblastn : Comparing the protein query sequence against a nucleotide database translated in all reading frames. tblastx : Comparing the six –reading frame translations of the nucleotide query against six frame translations of the nucleotide sequence database.
  • 17. MegaBLAST MegaBLAST is a program optimized for aligning long sequences. It can only work with DNA sequences, hence the only program it supports is “blastn”. It is faster than blastn but less sensitive,
  • 18. SEQUENCE SUBMISSION TO NCBI The databases are constantly updated through newer submissions of sequences, and this is done using the following sequence submission tools : 1. BankIt 2. Sequin
  • 19. BankIt BankIT is a web based GenBank sequence submission tool. It is a tool of choice for simple submissions, especially when only one or small number of records are to be submitted. It can also be used by submitters to update their existing GenBank records. Sequence analysis tools are not required for submission through this process.
  • 20. SEQUIN Sequin is a stand-alone software tool developed by NCBI which aids in submission and updating entries to the sequence databases. It helps in handling multiple sequence submissions, provides increased capacity for complex submissions containing long sequences, multiple annotations, segmented sets of DNA or phylogenetic and population studies. It also provides graphical viewing and editing options.
  • 22.
  • 23.
  • 24.
  • 25.
  • 26. SPECIALISED TOOLS Some of the specialized tools for the sequence analysis are : 1. ORF Finder 2. e-PCR 3. Spidey
  • 27. Open Reading Frame (ORF) Finder ORF Finder is an essential graphical analysis tool, which finds all open reading frames of a selectable minimum size in a user’s sequence or in a sequence already in the database. It uses the standard or alternative genetic codes to identify all open reading frames. This is helpful in preparing complete and accurate sequence submissions. It is also packaged with the Sequin sequence submission software.
  • 28. e-PCR (Electronic Polymerase Chain Reaction) e-PCR is a computational procedure used to identify sequence-tagged sites (STSs) within DNA sequences. e-PCR looks for potential STSs by searching for sub-sequences that closely match the PCR primers used to generate known STSs and that have the correct order, orientation, and spacing. The newer version of e-PCR also provides a search mode that runs a query sequence against a sequence database.
  • 29. SPIDEY This is an mRNA-to-genomic alignment program that uses local alignment tools like BLAST to find its alignments. Spidey takes as input a single genomic sequence and a set of mRNA sequences in FASTA format. Spidey first defines windows on the genomic sequence and then performs the mRNA-to-genomic alignment separately within each window, to avoid including exons from paralogs and pseudogenes. It has no maximum intron size and does not favour shorter or longer introns.
  • 30. Databases  Structured collection of information.  Consists of basic units called records or entries.  The perfect database:  Comprehensive but easy to search  Cross-referenced  Minimum redundancy
  • 31. NCBI Databases  Nucleotide database  Literature database  Protein database  Gene expression database  Structural database  Chemical database  Other databases
  • 33. Kinds of databases Primary database  Original submissions by experimentalists.  Database staff organise the data but do not add additional information.  Example – GenBank Derivative databases  Derived from primary data.  Content controlled by a third party.  Examples – RefSeq, Swiss-Prot, UniGene
  • 34. Nucleotide database  GenBank  NCBI’s primary sequence database.  It is a comprehensive public database of nucleotide sequences.  GenBank, together with EMBL and DDBJ, makes up the INSD.  The three databases exchange data daily to ensure a uniform and comprehensive collection of sequence information.
  • 36. Accession numbers are labels for sequences  DNA sequences and other molecular data are tagged with accession numbers, which identify a sequence or other record relevant to molecular data.  An accession number is a string of letters and/or numbers that corresponds to a molecular sequence.  It is shared among the 3 collaborating databases and remains constant over the lifetime of the record.  The DNA sequence within a GenBank record is also assigned a unique NCBI identifier called a ‘gi’, which appears on the version line of flat file records, following the accession number.
  • 37. Retrieval of the nucleotide sequence of the beta-globin gene from Xenopus laevis
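The slides perform this retrieval through the Entrez web interface. As a rough programmatic counterpart, the sketch below builds a search URL for NCBI's E-utilities `esearch` endpoint; using E-utilities at all is an assumption that goes beyond the slides, and the helper name `build_esearch_url`, the exact search term, and the choice of field tags are illustrative only. The code only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base URL of the E-utilities esearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, retmax=20):
    """Build an esearch URL for the given Entrez database and query term."""
    params = {"db": db, "term": term, "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)


# Hypothetical query: the wording and field tags are one plausible way to
# express "beta-globin from Xenopus laevis", not the query used in the slides.
url = build_esearch_url(
    "nucleotide",
    "Xenopus laevis[Organism] AND beta-globin[Title]",
)

if __name__ == "__main__":
    print(url)
```

Opening the resulting URL returns a list of matching record identifiers, which could then be used to fetch the full records.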
  • 43. NCBI’s Derivative Sequence Database  RefSeq  It is a non-redundant collection of nucleotide and protein sequences.  It is derived from the primary submissions available in GenBank.  RefSeq records can be distinguished from GenBank records by the format of the accession series.  RefSeq accession numbers are formatted as two alphabetic characters followed by an underscore ‘_’.  GenBank accession numbers never include an underscore.
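The accession-format rule above can be expressed as a short Python check. This is a minimal sketch of the rule as stated on the slide (two letters followed by an underscore means RefSeq; no underscore means a GenBank-style accession); the function name `classify_accession` and the returned labels are hypothetical, and the check does not validate that an accession actually exists.

```python
import re

# Two alphabetic characters followed by an underscore, as described for RefSeq.
REFSEQ_PREFIX = re.compile(r"^[A-Z]{2}_")


def classify_accession(accession):
    """Classify an accession string by its format alone."""
    acc = accession.strip().upper()
    if REFSEQ_PREFIX.match(acc):
        return "RefSeq"
    if "_" not in acc:
        return "GenBank-style"
    # Contains an underscore but not in the RefSeq position.
    return "unknown"


if __name__ == "__main__":
    for acc in ("NM_000518", "U49845"):
        print(acc, classify_accession(acc))
```

Because the check looks only at the format, it is useful for sorting a mixed list of accessions but not for confirming that a record is present in either database.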
  • 44. Literature database  PMC – PubMed Central  It is a digital archive of peer-reviewed journals in the life sciences providing access to full-text articles.  All PMC free articles are identified in PubMed search results and PMC itself can be searched using Entrez.
  • 45. Retrieval of the complete entry on the role of remorin protein from the PubMed database
  • 49. Protein database  Entrez Protein is the protein sequence database of NCBI.  The protein sequences in this database come from several different sources, such as Swiss-Prot and PDB.  There are GenPept translations for each of the coding sequences within the GenBank nucleotide database.  The Entrez Protein database is cross-linked to the Entrez Taxonomy database.  It is also linked to the Conserved Domain Database (CDD).  Clicking on an individual Entrez Protein search result displays the protein sequence in a format known as GenPept.
  • 50. Expression database  GEO – Gene Expression Omnibus  Covers the distribution and regulation of the transcriptional products of normal and abnormal cell types.  SAGEmap – Serial Analysis of Gene Expression map.
  • 51. Structural database  MMDB – Molecular Modeling Database.  3D macromolecular structures.  X-ray diffraction (XRD) and NMR are used for experimental structure determination.  These structures provide a wealth of information about biological function, the mechanisms linked to that function, its evolutionary history, and the relationships between macromolecules.
  • 52. Chemical database  PubChem is a database of chemical molecules maintained by NCBI.  It focuses on the chemical, structural and biological properties of small molecules.  Molecular mass below 2000 u.
  • 53. Other databases  OMIM – Online Mendelian Inheritance in Man.  It is a comprehensive, authoritative and timely knowledge base of human genes and genetic disorders.  OMIA – Online Mendelian Inheritance in Animals.  It is a database of genes, inherited disorders and traits in animal species other than human and mouse.