• Develop and maintain molecular and bibliographic databases.
• Develop software for searching and analyzing these data.
• Provide a Web access point for data and software.
1/24/2017 2
• Sequences
• Expression
• Genome Maps
• 3D Structures
• Protein Domains
• Homologous Genes, Proteins, Structures
• Pathways
• Genetic Variation
• Biomedical Literature
  • PubMed, PubMed Central, Bookshelf
• Molecular Databases and Metadatabases
  • Sequences, Structures, Variation, Chemicals, etc.
• Clinical / Medical Genetics
  • GTR, ClinVar, MedGen, OMIM, PubMed Health, dbGaP
• Primary Data / Database
  • Results of a particular technique
  • Submitted to NCBI
  • Submitter has editorial control
• Curated Data / Database
  • Based on primary database records
  • Third party (NCBI) maintains and updates
  • Often includes additional analyses
• Sequences (DNA)
  • GenBank (International Nucleotide Sequence Database Collaboration): now 2.1 × 10^12 bases
  • Sequence Read Archive (SRA): next-gen sequence reads, now 9.7 × 10^15 bases!
• Other databases with a primary component
  • Expression
    • Gene Expression Omnibus
    • RNA-Seq, microarray, other high-throughput data
  • Variation
    • dbSNP: small-scale variants
    • dbVar: genomic structural studies
    • Database of Genotypes and Phenotypes (dbGaP)
• Sequences
  • GenPept: translations of CDS regions on INSDC records
  • NCBI Reference Sequences (DNA and Protein)
• Variation
  • NCBI Reference SNPs (non-redundant set of variants)
• Structures
  • NCBI's MMDB, based on PDB
• Conserved Domains
  • NCBI Conserved Domain Database
• Entrez: integrated literature and molecular databases
• Graphical Sequence Viewer: annotation viewer and analysis tool
• BLAST: sequence similarity search service
• VAST: structure similarity searches
• Cn3D: 3D structure viewer
• Genome Workbench: standalone sequence analysis and annotation platform
• SRA Utilities
  • SRA Run Browser: web access for viewing, searching and downloading next-generation reads
  • SRA Toolkit: standalone SRA manipulator and client
www.ncbi.nlm.nih.gov
• Literature
  • PubMed, PMC, Books
• Sequences
  • Protein, Nuccore, GSS, SRA, Assembly
• Expression
  • GEO Profiles
• Variation
  • dbSNP, dbVar
• Protein and nucleic acid structures
  • Structure
• Small Molecules
  • PubChem
• Medical Genetics
  • ClinVar, MedGen, GTR
Central Resources / Databases
• Taxonomy
• BioProject
• Assembly
• Gene
Follow links to others when needed: Nucleotide, Protein, SRA
The Entrez system: 39 (and counting) integrated databases
If your question is about data for ...
• an organism -> Taxonomy
• a gene name -> Gene (common organisms)
• a large-scale project -> BioProject
• a bacterial genome -> Genome
• a genome sequence -> Assembly
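The routing list above can be written down as a small lookup table. This is only an illustrative sketch: the dictionary keys and the function name `entry_point` are made up for this example and are not part of any NCBI tool.

```python
# Hypothetical lookup mirroring the slide: question topic -> suggested
# Entrez database to start from. Keys are illustrative labels only.
ENTRY_POINTS = {
    "organism": "Taxonomy",
    "gene name": "Gene",
    "large-scale project": "BioProject",
    "bacterial genome": "Genome",
    "genome sequence": "Assembly",
}


def entry_point(topic: str) -> str:
    """Return the suggested starting database for a question topic."""
    key = topic.strip().lower()
    if key not in ENTRY_POINTS:
        known = ", ".join(sorted(ENTRY_POINTS))
        raise ValueError(f"unknown topic {topic!r}; expected one of: {known}")
    return ENTRY_POINTS[key]
```

For instance, `entry_point("Bacterial genome")` looks up the Genome database, matching the fourth bullet.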
Organizes gene-centered data
• Biological role; genomic context; phenotypes; interactions; literature
• Sequences
  • Genomic
  • Transcript
  • Proteins
• Best entry point for many biomolecular searches
• Eukaryotic and Microbial Genomes
  • 17.3 million records for 13,566 taxa
• Provide a reference standard
• Represent all molecules in the central dogma
• Selected Eukaryotes
  • Genomic
  • Transcripts
  • Proteins
• All Prokaryotes and Viruses
  • Genomic and Protein only
• Maintained by NCBI staff and outside experts
• Distinct accession series
  • (NC_, AC_, NG_, NM_, NP_, NR_, XM_, XR_)
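Because each RefSeq accession series has its own prefix, the molecule type can be read off the first three characters. The sketch below is an illustration rather than an NCBI utility: the function name `classify_refseq` and the short descriptions are this example's own wording, the protein prefixes NP_ and XP_ are included as an addition to the list above, and other RefSeq prefixes that exist are deliberately left out.

```python
# Illustrative mapping of RefSeq accession prefixes to molecule types.
# Descriptions paraphrase the usual RefSeq conventions; the table is a
# partial sketch, not a complete list of prefixes.
REFSEQ_PREFIXES = {
    "NC_": ("genomic", "complete genomic molecule, usually reference assembly"),
    "AC_": ("genomic", "complete genomic molecule, usually alternate assembly"),
    "NG_": ("genomic", "incomplete genomic region"),
    "NM_": ("mRNA", "protein-coding transcript"),
    "NR_": ("RNA", "non-protein-coding transcript"),
    "XM_": ("mRNA", "predicted model, protein-coding transcript"),
    "XR_": ("RNA", "predicted model, non-protein-coding transcript"),
    "NP_": ("protein", "protein product of an NM_ or NC_ record"),
    "XP_": ("protein", "predicted model protein, paired with XM_"),
}


def classify_refseq(accession: str):
    """Return (molecule type, description) for a RefSeq-style accession.

    Returns None when the prefix is not in the table, for example for
    GenBank accessions, which carry no underscore prefix.
    """
    prefix = accession.strip().upper()[:3]
    return REFSEQ_PREFIXES.get(prefix)
```

The accession strings passed in are up to the caller; a placeholder such as `NM_123456.1` is classified as an mRNA purely from its prefix, with no lookup against NCBI.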
Specific gene:
XXX[Symbol] AND YYY[Organism]
APRT[Symbol] AND human[Organism]
apt[Symbol] AND Escherichia coli[Organism]
All genes:
YYY[Organism] AND current only[Filter]
zebrafish[Organism] AND current only[Filter]
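The query patterns above can also be assembled programmatically. The following is a minimal sketch that only builds strings and does not contact NCBI. It assumes the same field tags work when the term is sent to the E-utilities ESearch endpoint; the helper names (`gene_by_symbol`, `all_genes`, `esearch_url`) are hypothetical and exist only for this example.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_by_symbol(symbol: str, organism: str) -> str:
    """Build the 'XXX[Symbol] AND YYY[Organism]' pattern from the slide."""
    return f"{symbol}[Symbol] AND {organism}[Organism]"


def all_genes(organism: str) -> str:
    """Build the 'YYY[Organism] AND current only[Filter]' pattern."""
    return f"{organism}[Organism] AND current only[Filter]"


def esearch_url(term: str, db: str = "gene", retmax: int = 20) -> str:
    """Return an ESearch URL for the term; no network request is made."""
    params = {"db": db, "term": term, "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    print(gene_by_symbol("APRT", "human"))
    print(all_genes("zebrafish"))
    print(esearch_url(gene_by_symbol("apt", "Escherichia coli")))
```

Keeping the query construction separate from the URL construction lets the same strings be pasted into the Gene search box on the web site.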
Protein-Structure Shortcut
[Diagram: the Gene record as a hub (Genomic Structure; Orthologs via Gpipe) linked to other Entrez databases. Expression: UniGene, GEO Profiles. Homologs: HomoloGene. Literature: PubMed, PMC. Structures: Structure. Variation: SNP, ClinVar, OMIM, dbGaP. Sequences: Nuccore, Protein (Homologs via Blink; Proteins w Structure via Related Structures), SRA.]
• Learn: <ncbi>/learn.shtml
• Factsheets: <ftp>/pub/factsheets/
• NCBI YouTube Channel: www.youtube.com/ncbinlm
• NCBI Helpdesk: info@ncbi.nlm.nih.gov

More Related Content

What's hot

Protein databases
Protein databasesProtein databases
Protein databases
sarumalay
 
STRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICS
STRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICSSTRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICS
STRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICS
SHEETHUMOLKS
 
Introduction to databases.pptx
Introduction to databases.pptxIntroduction to databases.pptx
Introduction to databases.pptx
sworna kumari chithiraivelu
 
Introduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASEIntroduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASE
PrashantSharma807
 
Protein information resource (PIR)
Protein information resource (PIR)Protein information resource (PIR)
Protein information resource (PIR)
ShivaniShewale2
 
NCBI
NCBINCBI
Genomic databases
Genomic databasesGenomic databases
Protein data bank
Protein data bankProtein data bank
Protein data bank
Yogesh Joshi
 
Tools and database of NCBI
Tools and database of NCBITools and database of NCBI
Tools and database of NCBI
Santosh Kumar Sahoo
 
sequence of file formats in bioinformatics
sequence of file formats in bioinformaticssequence of file formats in bioinformatics
sequence of file formats in bioinformatics
nadeem akhter
 
Sequence homology search and multiple sequence alignment(1)
Sequence homology search and multiple sequence alignment(1)Sequence homology search and multiple sequence alignment(1)
Sequence homology search and multiple sequence alignment(1)
AnkitTiwari354
 
Biological databases
Biological databasesBiological databases
Biological databases
Sarfaraz Nasri
 
NGS: Mapping and de novo assembly
NGS: Mapping and de novo assemblyNGS: Mapping and de novo assembly
NGS: Mapping and de novo assembly
Bioinformatics and Computational Biosciences Branch
 
Est database
Est databaseEst database
Est database
Amit Ruchi Yadav
 
Gemome annotation
Gemome annotationGemome annotation
Gemome annotation
Tajammal Daultana
 
UniProt
UniProtUniProt
UniProt
AmnaA7
 
Gene identification using bioinformatic tools.pptx
Gene identification using bioinformatic tools.pptxGene identification using bioinformatic tools.pptx
Gene identification using bioinformatic tools.pptx
University of Petroleum and Energy studies
 
prediction methods for ORF
prediction methods for ORFprediction methods for ORF
prediction methods for ORF
karamveer prajapat
 
Introduction to NCBI
Introduction to NCBIIntroduction to NCBI
Introduction to NCBI
geetikaJethra
 
Gen bank databases
Gen bank databasesGen bank databases
Gen bank databases
Hafiz Muhammad Zeeshan Raza
 

What's hot (20)

Protein databases
Protein databasesProtein databases
Protein databases
 
STRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICS
STRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICSSTRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICS
STRUCTURAL GENOMICS, FUNCTIONAL GENOMICS, COMPARATIVE GENOMICS
 
Introduction to databases.pptx
Introduction to databases.pptxIntroduction to databases.pptx
Introduction to databases.pptx
 
Introduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASEIntroduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASE
 
Protein information resource (PIR)
Protein information resource (PIR)Protein information resource (PIR)
Protein information resource (PIR)
 
NCBI
NCBINCBI
NCBI
 
Genomic databases
Genomic databasesGenomic databases
Genomic databases
 
Protein data bank
Protein data bankProtein data bank
Protein data bank
 
Tools and database of NCBI
Tools and database of NCBITools and database of NCBI
Tools and database of NCBI
 
sequence of file formats in bioinformatics
sequence of file formats in bioinformaticssequence of file formats in bioinformatics
sequence of file formats in bioinformatics
 
Sequence homology search and multiple sequence alignment(1)
Sequence homology search and multiple sequence alignment(1)Sequence homology search and multiple sequence alignment(1)
Sequence homology search and multiple sequence alignment(1)
 
Biological databases
Biological databasesBiological databases
Biological databases
 
NGS: Mapping and de novo assembly
NGS: Mapping and de novo assemblyNGS: Mapping and de novo assembly
NGS: Mapping and de novo assembly
 
Est database
Est databaseEst database
Est database
 
Gemome annotation
Gemome annotationGemome annotation
Gemome annotation
 
UniProt
UniProtUniProt
UniProt
 
Gene identification using bioinformatic tools.pptx
Gene identification using bioinformatic tools.pptxGene identification using bioinformatic tools.pptx
Gene identification using bioinformatic tools.pptx
 
prediction methods for ORF
prediction methods for ORFprediction methods for ORF
prediction methods for ORF
 
Introduction to NCBI
Introduction to NCBIIntroduction to NCBI
Introduction to NCBI
 
Gen bank databases
Gen bank databasesGen bank databases
Gen bank databases
 

Viewers also liked

Genomic futures v_pitt_kent_osu
Genomic futures v_pitt_kent_osuGenomic futures v_pitt_kent_osu
Genomic futures v_pitt_kent_osu
Ben Busby
 
Advanced genomics v_medical_pitt_kent_osu
Advanced genomics v_medical_pitt_kent_osuAdvanced genomics v_medical_pitt_kent_osu
Advanced genomics v_medical_pitt_kent_osu
Ben Busby
 
Genomic Futures v2
Genomic Futures v2 Genomic Futures v2
Genomic Futures v2
Ben Busby
 
Umcp cs talk_11_3_16_v1
Umcp cs talk_11_3_16_v1Umcp cs talk_11_3_16_v1
Umcp cs talk_11_3_16_v1
Ben Busby
 
Genomic futures v4
Genomic futures v4Genomic futures v4
Genomic futures v4
Ben Busby
 
Next Generation Preprint Service
Next Generation Preprint ServiceNext Generation Preprint Service
Next Generation Preprint Service
Philip Bourne
 
Nutrigenomics metadata 12_1_16_v4 (1)
Nutrigenomics metadata 12_1_16_v4 (1)Nutrigenomics metadata 12_1_16_v4 (1)
Nutrigenomics metadata 12_1_16_v4 (1)
Ben Busby
 
Tips for effective use of BLAST and other NCBI tools
Tips for effective use of BLAST and other NCBI toolsTips for effective use of BLAST and other NCBI tools
Tips for effective use of BLAST and other NCBI tools
Integrated DNA Technologies
 
Experiences In Building Globus Genomics Using Galaxy, Globus Online and AWS
Experiences In Building Globus Genomics Using Galaxy, Globus Online and AWSExperiences In Building Globus Genomics Using Galaxy, Globus Online and AWS
Experiences In Building Globus Genomics Using Galaxy, Globus Online and AWS
Ed Dodds
 
Big Data and Genomics
Big Data and GenomicsBig Data and Genomics
Big Data and Genomics
Al Costa
 
Internet2 Bio IT 2016 v2
Internet2 Bio IT 2016 v2Internet2 Bio IT 2016 v2
Internet2 Bio IT 2016 v2
Dan Taylor
 
Role of Amyloid Burden in cognitive decline
Role of Amyloid Burden in cognitive decline Role of Amyloid Burden in cognitive decline
Role of Amyloid Burden in cognitive decline
Ravi Madduri
 
Public.Cdsc.Middleton
Public.Cdsc.MiddletonPublic.Cdsc.Middleton
Effective ansible
Effective ansibleEffective ansible
Effective ansible
Wu Bigo
 
HL7: Clinical Decision Support
HL7: Clinical Decision SupportHL7: Clinical Decision Support
HL7: Clinical Decision Support
Harvard Medical School, Partners Healthcare
 
Jsm madduri-august-2015
Jsm madduri-august-2015Jsm madduri-august-2015
Jsm madduri-august-2015
Ravi Madduri
 
ADAS&ME presentation @ the SCOUT project expert workshop (22-02-2017, Brussels)
ADAS&ME presentation @ the SCOUT project expert workshop (22-02-2017, Brussels)ADAS&ME presentation @ the SCOUT project expert workshop (22-02-2017, Brussels)
ADAS&ME presentation @ the SCOUT project expert workshop (22-02-2017, Brussels)
joseplaborda
 
CI4CC sustainability-panel
CI4CC sustainability-panelCI4CC sustainability-panel
CI4CC sustainability-panel
Ravi Madduri
 
re:Invent 2013-foster-madduri
re:Invent 2013-foster-maddurire:Invent 2013-foster-madduri
re:Invent 2013-foster-madduri
Ravi Madduri
 
Supporting Barack Obama for President
Supporting Barack Obama for PresidentSupporting Barack Obama for President
Supporting Barack Obama for President
Harvard Medical School, Partners Healthcare
 

Viewers also liked (20)

Genomic futures v_pitt_kent_osu
Genomic futures v_pitt_kent_osuGenomic futures v_pitt_kent_osu
Genomic futures v_pitt_kent_osu
 
Advanced genomics v_medical_pitt_kent_osu
Advanced genomics v_medical_pitt_kent_osuAdvanced genomics v_medical_pitt_kent_osu
Advanced genomics v_medical_pitt_kent_osu
 
Genomic Futures v2
Genomic Futures v2 Genomic Futures v2
Genomic Futures v2
 
Umcp cs talk_11_3_16_v1
Umcp cs talk_11_3_16_v1Umcp cs talk_11_3_16_v1
Umcp cs talk_11_3_16_v1
 
Genomic futures v4
Genomic futures v4Genomic futures v4
Genomic futures v4
 
Next Generation Preprint Service
Next Generation Preprint ServiceNext Generation Preprint Service
Next Generation Preprint Service
 
Nutrigenomics metadata 12_1_16_v4 (1)
Nutrigenomics metadata 12_1_16_v4 (1)Nutrigenomics metadata 12_1_16_v4 (1)
Nutrigenomics metadata 12_1_16_v4 (1)
 
Tips for effective use of BLAST and other NCBI tools
Tips for effective use of BLAST and other NCBI toolsTips for effective use of BLAST and other NCBI tools
Tips for effective use of BLAST and other NCBI tools
 
Experiences In Building Globus Genomics Using Galaxy, Globus Online and AWS
Experiences In Building Globus Genomics Using Galaxy, Globus Online and AWSExperiences In Building Globus Genomics Using Galaxy, Globus Online and AWS
Experiences In Building Globus Genomics Using Galaxy, Globus Online and AWS
 
Big Data and Genomics
Big Data and GenomicsBig Data and Genomics
Big Data and Genomics
 
Internet2 Bio IT 2016 v2
Internet2 Bio IT 2016 v2Internet2 Bio IT 2016 v2
Internet2 Bio IT 2016 v2
 
Role of Amyloid Burden in cognitive decline
Role of Amyloid Burden in cognitive decline Role of Amyloid Burden in cognitive decline
Role of Amyloid Burden in cognitive decline
 
Public.Cdsc.Middleton
Public.Cdsc.MiddletonPublic.Cdsc.Middleton
Public.Cdsc.Middleton
 
Effective ansible
Effective ansibleEffective ansible
Effective ansible
 
HL7: Clinical Decision Support
HL7: Clinical Decision SupportHL7: Clinical Decision Support
HL7: Clinical Decision Support
 
Jsm madduri-august-2015
Jsm madduri-august-2015Jsm madduri-august-2015
Jsm madduri-august-2015
 
ADAS&ME presentation @ the SCOUT project expert workshop (22-02-2017, Brussels)
ADAS&ME presentation @ the SCOUT project expert workshop (22-02-2017, Brussels)ADAS&ME presentation @ the SCOUT project expert workshop (22-02-2017, Brussels)
ADAS&ME presentation @ the SCOUT project expert workshop (22-02-2017, Brussels)
 
CI4CC sustainability-panel
CI4CC sustainability-panelCI4CC sustainability-panel
CI4CC sustainability-panel
 
re:Invent 2013-foster-madduri
re:Invent 2013-foster-maddurire:Invent 2013-foster-madduri
re:Invent 2013-foster-madduri
 
Supporting Barack Obama for President
Supporting Barack Obama for PresidentSupporting Barack Obama for President
Supporting Barack Obama for President
 

Similar to Ncbi basic intro_v_pitt_kent_osu

Biological databases
Biological databasesBiological databases
Biological databases
Ashfaq Ahmad
 
Kim Pruitt trainingbiocuration2015
Kim Pruitt trainingbiocuration2015Kim Pruitt trainingbiocuration2015
Kim Pruitt trainingbiocuration2015
Kim D. Pruitt
 
DNA Sequence Data in Big Data Perspective
DNA Sequence Data in Big Data PerspectiveDNA Sequence Data in Big Data Perspective
DNA Sequence Data in Big Data Perspective
Palaniappan SP
 
Next-Generation Sequencing and Data Analysis.pptx
Next-Generation Sequencing and Data Analysis.pptxNext-Generation Sequencing and Data Analysis.pptx
Next-Generation Sequencing and Data Analysis.pptx
SwetaTripathi13
 
Storing and Accessing Information. Databases and Queries (UEB-UAT Bioinformat...
Storing and Accessing Information. Databases and Queries (UEB-UAT Bioinformat...Storing and Accessing Information. Databases and Queries (UEB-UAT Bioinformat...
Storing and Accessing Information. Databases and Queries (UEB-UAT Bioinformat...
VHIR Vall d’Hebron Institut de Recerca
 
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ... Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
Syed Ahmad Chan Bukhari, PhD
 
Bioinformatics مي.pdf
Bioinformatics  مي.pdfBioinformatics  مي.pdf
Bioinformatics مي.pdf
nedalalazzwy
 
Database in bioinformatics
Database in bioinformaticsDatabase in bioinformatics
Database in bioinformatics
VinaKhan1
 
Karyotype DAS client
Karyotype DAS clientKaryotype DAS client
Karyotype DAS client
Rafael C. Jimenez
 
Informal presentation on bioinformatics
Informal presentation on bioinformaticsInformal presentation on bioinformatics
Informal presentation on bioinformatics
Atai Rabby
 
Building bioinformatics resources for the global community
Building bioinformatics resources for the global communityBuilding bioinformatics resources for the global community
Building bioinformatics resources for the global community
ExternalEvents
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
Ian Foster
 
Presentation from Code Camp 2017
Presentation from Code Camp 2017Presentation from Code Camp 2017
Presentation from Code Camp 2017
Mitch Miller
 
Jax bio dataworldcongress.ngs.20181128finalwithoutbu
Jax bio dataworldcongress.ngs.20181128finalwithoutbuJax bio dataworldcongress.ngs.20181128finalwithoutbu
Jax bio dataworldcongress.ngs.20181128finalwithoutbu
Anne Deslattes Mays
 
Data retriveal ,srg and dbget
Data retriveal ,srg and dbgetData retriveal ,srg and dbget
Data retriveal ,srg and dbget
SurendraKumar338
 
database retrival.pdf
database retrival.pdfdatabase retrival.pdf
database retrival.pdf
SrimathideviJ
 
Final Acb All Hands 26 11 07.Key
Final Acb All Hands 26 11 07.KeyFinal Acb All Hands 26 11 07.Key
Final Acb All Hands 26 11 07.Key
guest3d0531
 
FAIR as a Working Principle for Cancer Genomic Data
FAIR as a Working Principle for Cancer Genomic DataFAIR as a Working Principle for Cancer Genomic Data
FAIR as a Working Principle for Cancer Genomic Data
Ian Fore
 
BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES
nadeem akhter
 
Intro to databases
Intro to databasesIntro to databases
Intro to databases
bhargvi sharma
 

Similar to Ncbi basic intro_v_pitt_kent_osu (20)

Biological databases
Biological databasesBiological databases
Biological databases
 
Kim Pruitt trainingbiocuration2015
Kim Pruitt trainingbiocuration2015Kim Pruitt trainingbiocuration2015
Kim Pruitt trainingbiocuration2015
 
DNA Sequence Data in Big Data Perspective
DNA Sequence Data in Big Data PerspectiveDNA Sequence Data in Big Data Perspective
DNA Sequence Data in Big Data Perspective
 
Next-Generation Sequencing and Data Analysis.pptx
Next-Generation Sequencing and Data Analysis.pptxNext-Generation Sequencing and Data Analysis.pptx
Next-Generation Sequencing and Data Analysis.pptx
 
Storing and Accessing Information. Databases and Queries (UEB-UAT Bioinformat...
Storing and Accessing Information. Databases and Queries (UEB-UAT Bioinformat...Storing and Accessing Information. Databases and Queries (UEB-UAT Bioinformat...
Storing and Accessing Information. Databases and Queries (UEB-UAT Bioinformat...
 
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ... Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
Use of CEDAR Technology for Ontology-based Submission of Biomedical Data to ...
 
Bioinformatics مي.pdf
Bioinformatics  مي.pdfBioinformatics  مي.pdf
Bioinformatics مي.pdf
 
Database in bioinformatics
Database in bioinformaticsDatabase in bioinformatics
Database in bioinformatics
 
Karyotype DAS client
Karyotype DAS clientKaryotype DAS client
Karyotype DAS client
 
Informal presentation on bioinformatics
Informal presentation on bioinformaticsInformal presentation on bioinformatics
Informal presentation on bioinformatics
 
Building bioinformatics resources for the global community
Building bioinformatics resources for the global communityBuilding bioinformatics resources for the global community
Building bioinformatics resources for the global community
 
Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009Quantitative Medicine Feb 2009
Quantitative Medicine Feb 2009
 
Presentation from Code Camp 2017
Presentation from Code Camp 2017Presentation from Code Camp 2017
Presentation from Code Camp 2017
 
Jax bio dataworldcongress.ngs.20181128finalwithoutbu
Jax bio dataworldcongress.ngs.20181128finalwithoutbuJax bio dataworldcongress.ngs.20181128finalwithoutbu
Jax bio dataworldcongress.ngs.20181128finalwithoutbu
 
Data retriveal ,srg and dbget
Data retriveal ,srg and dbgetData retriveal ,srg and dbget
Data retriveal ,srg and dbget
 
database retrival.pdf
database retrival.pdfdatabase retrival.pdf
database retrival.pdf
 
Final Acb All Hands 26 11 07.Key
Final Acb All Hands 26 11 07.KeyFinal Acb All Hands 26 11 07.Key
Final Acb All Hands 26 11 07.Key
 
FAIR as a Working Principle for Cancer Genomic Data
FAIR as a Working Principle for Cancer Genomic DataFAIR as a Working Principle for Cancer Genomic Data
FAIR as a Working Principle for Cancer Genomic Data
 
BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES BIOLOGICAL SEQUENCE DATABASES
BIOLOGICAL SEQUENCE DATABASES
 
Intro to databases
Intro to databasesIntro to databases
Intro to databases
 

More from Ben Busby

Addressing privacy concerns_in_the_age_of_federated_data_access
Addressing privacy concerns_in_the_age_of_federated_data_accessAddressing privacy concerns_in_the_age_of_federated_data_access
Addressing privacy concerns_in_the_age_of_federated_data_access
Ben Busby
 
Containerized attribute indexing and graph genomes for federated data access
Containerized attribute indexing and graph genomes for federated data accessContainerized attribute indexing and graph genomes for federated data access
Containerized attribute indexing and graph genomes for federated data access
Ben Busby
 
Artificial_Intelligence_for_Data_Reuse_2019
Artificial_Intelligence_for_Data_Reuse_2019Artificial_Intelligence_for_Data_Reuse_2019
Artificial_Intelligence_for_Data_Reuse_2019
Ben Busby
 
Dream.recomb.ncbi.hackathons v003
Dream.recomb.ncbi.hackathons v003Dream.recomb.ncbi.hackathons v003
Dream.recomb.ncbi.hackathons v003
Ben Busby
 
Human_Pangenomics_Bio-IT_2019
Human_Pangenomics_Bio-IT_2019Human_Pangenomics_Bio-IT_2019
Human_Pangenomics_Bio-IT_2019
Ben Busby
 
RNAML_Bio-IT_2019
RNAML_Bio-IT_2019RNAML_Bio-IT_2019
RNAML_Bio-IT_2019
Ben Busby
 
Hackathon_Bio-IT_2019
Hackathon_Bio-IT_2019Hackathon_Bio-IT_2019
Hackathon_Bio-IT_2019
Ben Busby
 
Data science futures_v_vu2
Data science futures_v_vu2Data science futures_v_vu2
Data science futures_v_vu2
Ben Busby
 
Sage 2 19_v5_busby
Sage 2 19_v5_busbySage 2 19_v5_busby
Sage 2 19_v5_busby
Ben Busby
 
Bb health ai_jan26_v2
Bb health ai_jan26_v2Bb health ai_jan26_v2
Bb health ai_jan26_v2
Ben Busby
 
BB_NCBI_PAG_2019_Workshop
BB_NCBI_PAG_2019_WorkshopBB_NCBI_PAG_2019_Workshop
BB_NCBI_PAG_2019_Workshop
Ben Busby
 
Hackathons lightning v_nbs
Hackathons lightning v_nbsHackathons lightning v_nbs
Hackathons lightning v_nbs
Ben Busby
 
Cmu oss 18
Cmu oss 18Cmu oss 18
Cmu oss 18
Ben Busby
 
Genome web v_repro1
Genome web v_repro1Genome web v_repro1
Genome web v_repro1
Ben Busby
 
Data science futures_v_une
Data science futures_v_uneData science futures_v_une
Data science futures_v_une
Ben Busby
 
Variant and disease_grs_kickoff
Variant and disease_grs_kickoffVariant and disease_grs_kickoff
Variant and disease_grs_kickoff
Ben Busby
 
Bioinformatics_resources_SVAI_v2
Bioinformatics_resources_SVAI_v2Bioinformatics_resources_SVAI_v2
Bioinformatics_resources_SVAI_v2
Ben Busby
 
Ncbi resources i5_k_v4
Ncbi resources i5_k_v4Ncbi resources i5_k_v4
Ncbi resources i5_k_v4
Ben Busby
 
Ncbi resources abrf_v3
Ncbi resources abrf_v3Ncbi resources abrf_v3
Ncbi resources abrf_v3
Ben Busby
 
Data science futures_v_lbirn
Data science futures_v_lbirnData science futures_v_lbirn
Data science futures_v_lbirn
Ben Busby
 

More from Ben Busby (20)

Addressing privacy concerns_in_the_age_of_federated_data_access
Addressing privacy concerns_in_the_age_of_federated_data_accessAddressing privacy concerns_in_the_age_of_federated_data_access
Addressing privacy concerns_in_the_age_of_federated_data_access
 
Containerized attribute indexing and graph genomes for federated data access
Containerized attribute indexing and graph genomes for federated data accessContainerized attribute indexing and graph genomes for federated data access
Containerized attribute indexing and graph genomes for federated data access
 
Artificial_Intelligence_for_Data_Reuse_2019
Artificial_Intelligence_for_Data_Reuse_2019Artificial_Intelligence_for_Data_Reuse_2019
Artificial_Intelligence_for_Data_Reuse_2019
 
Dream.recomb.ncbi.hackathons v003
Dream.recomb.ncbi.hackathons v003Dream.recomb.ncbi.hackathons v003
Dream.recomb.ncbi.hackathons v003
 
Human_Pangenomics_Bio-IT_2019
Human_Pangenomics_Bio-IT_2019Human_Pangenomics_Bio-IT_2019
Human_Pangenomics_Bio-IT_2019
 
RNAML_Bio-IT_2019
RNAML_Bio-IT_2019RNAML_Bio-IT_2019
RNAML_Bio-IT_2019
 
Hackathon_Bio-IT_2019
Hackathon_Bio-IT_2019Hackathon_Bio-IT_2019
Hackathon_Bio-IT_2019
 
Data science futures_v_vu2
Data science futures_v_vu2Data science futures_v_vu2
Data science futures_v_vu2
 
Sage 2 19_v5_busby
Sage 2 19_v5_busbySage 2 19_v5_busby
Sage 2 19_v5_busby
 
Bb health ai_jan26_v2
Bb health ai_jan26_v2Bb health ai_jan26_v2
Bb health ai_jan26_v2
 
BB_NCBI_PAG_2019_Workshop
BB_NCBI_PAG_2019_WorkshopBB_NCBI_PAG_2019_Workshop
BB_NCBI_PAG_2019_Workshop
 
Hackathons lightning v_nbs
Hackathons lightning v_nbsHackathons lightning v_nbs
Hackathons lightning v_nbs
 
Cmu oss 18
Cmu oss 18Cmu oss 18
Cmu oss 18
 
Genome web v_repro1
Genome web v_repro1Genome web v_repro1
Genome web v_repro1
 
Data science futures_v_une
Data science futures_v_uneData science futures_v_une
Data science futures_v_une
 
Variant and disease_grs_kickoff
Variant and disease_grs_kickoffVariant and disease_grs_kickoff
Variant and disease_grs_kickoff
 
Bioinformatics_resources_SVAI_v2
Bioinformatics_resources_SVAI_v2Bioinformatics_resources_SVAI_v2
Bioinformatics_resources_SVAI_v2
 
Ncbi resources i5_k_v4
Ncbi resources i5_k_v4Ncbi resources i5_k_v4
Ncbi resources i5_k_v4
 
Ncbi resources abrf_v3
Ncbi resources abrf_v3Ncbi resources abrf_v3
Ncbi resources abrf_v3
 
Data science futures_v_lbirn
Data science futures_v_lbirnData science futures_v_lbirn
Data science futures_v_lbirn
 

Recently uploaded

Deep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless ReproducibilityDeep Software Variability and Frictionless Reproducibility
Deep Software Variability and Frictionless Reproducibility
University of Rennes, INSA Rennes, Inria/IRISA, CNRS
 
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...THEMATIC  APPERCEPTION  TEST(TAT) cognitive abilities, creativity, and critic...
THEMATIC APPERCEPTION TEST(TAT) cognitive abilities, creativity, and critic...
Abdul Wali Khan University Mardan,kP,Pakistan
 
Applied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdfApplied Science: Thermodynamics, Laws & Methodology.pdf
Applied Science: Thermodynamics, Laws & Methodology.pdf
University of Hertfordshire
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
Sharon Liu
 
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...
University of Maribor
 
Cytokines and their role in immune regulation.pptx
Cytokines and their role in immune regulation.pptxCytokines and their role in immune regulation.pptx
Cytokines and their role in immune regulation.pptx
Hitesh Sikarwar
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
Sérgio Sacani
 
NuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyerNuGOweek 2024 Ghent programme overview flyer
NuGOweek 2024 Ghent programme overview flyer
pablovgd
 
Equivariant neural networks and representation theory
Equivariant neural networks and representation theoryEquivariant neural networks and representation theory
Equivariant neural networks and representation theory
Daniel Tubbenhauer
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
MAGOTI ERNEST
 
Phenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvementPhenomics assisted breeding in crop improvement
Phenomics assisted breeding in crop improvement
IshaGoswami9
 
8.Isolation of pure cultures and preservation of cultures.pdf


Ncbi basic intro_v_pitt_kent_osu

  • 8. Entrez: integrated literature and molecular databases; Graphical Sequence Viewer: annotation viewer and analysis tool; BLAST: sequence similarity search service; VAST: structure similarity searches; Cn3D: 3D structure viewer; Genome Workbench: standalone sequence analysis annotation platform; SRA Utilities: SRA Run Browser (web access for viewing, searching and downloading next-generation reads), SRA toolkit (standalone SRA manipulator and client)
  • 11. Literature: PubMed, PMC, Books; Sequences: Protein, Nuccore, GSS, SRA, Assembly; Expression: GEO Profiles; Variation: dbSNP, dbVar; Protein and nucleic acid structures: Structure; Small molecules: PubChem; Medical genetics: ClinVar, MedGen, GTR
  • 12. The Entrez system: 39 (and counting) integrated databases. Central Resources / Databases: Taxonomy, BioProject, Assembly, Gene. Follow links to others when needed: Nucleotide, Protein, SRA
  • 14. If your question is about data for ... an organism -> Taxonomy; a gene name -> Gene (common organisms); a large-scale project -> BioProject; a bacterial genome -> Genome; a genome sequence -> Assembly
  • 15. Organizes gene-centered data: biological role, genomic context, phenotypes, interactions, literature. Sequences: genomic, transcript, proteins. Best entry point for many biomolecular searches. Eukaryotic and microbial genomes. 17.3 million records for 13,566 taxa
  • 16. Provide a reference standard. Represent all molecules in the central dogma. Selected eukaryotes: genomic, transcripts, proteins. All prokaryotes and viruses: genomic and protein only. Maintained by NCBI staff and outside experts. Distinct accession series (NC_, AC_, NG_, NM_, NP_, NR_, XM_, XR_)
  • 17. Specific gene: XXX[Symbol] AND YYY[Organism], e.g. APRT[Symbol] AND human[Organism]; apt[Symbol] AND Escherichia coli[Organism]. All genes: YYY[Organism] AND current only[Filter], e.g. zebrafish[Organism] AND current only[Filter]
  • 20. Diagram labels: Expression: UniGene, GEO Profiles; Homologs: HomoloGene; Literature: PubMed, PMC; Gene: Genomic Structure, Orthologs via Gpipe; Structures: Structure; Variation: SNP, ClinVar; OMIM; dbGaP; Sequences: Nuccore, Protein; Homologs via Blink; Proteins w/ Structure via Related Structures; SRA
  • 21. Learn: <ncbi>/learn.shtml; Factsheets: <ftp>/pub/factsheets/; NCBI YouTube Channel: www.youtube.com/ncbinlm; NCBI Helpdesk: info@ncbi.nlm.nih.gov
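
The Gene query pattern on slide 17 (XXX[Symbol] AND YYY[Organism]) can also be assembled programmatically. The following is a minimal sketch, not part of the original slides: it assumes the public E-utilities ESearch endpoint and its `db`, `term`, `retmax` and `retmode` parameters, and the helper names `gene_term` and `esearch_url` are hypothetical. It only builds the query string and URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumption: the public E-utilities ESearch endpoint (not named on the slides).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_term(symbol: str, organism: str) -> str:
    """Build a slide-17 style term: XXX[Symbol] AND YYY[Organism]."""
    return f"{symbol}[Symbol] AND {organism}[Organism]"


def esearch_url(term: str, db: str = "gene", retmax: int = 20) -> str:
    """Return an ESearch URL for the term; nothing is fetched here."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    # urlencode percent-encodes the brackets and turns spaces into '+'.
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # The two example searches given on the slide.
    for symbol, organism in [("APRT", "human"), ("apt", "Escherichia coli")]:
        term = gene_term(symbol, organism)
        print(term)
        print(esearch_url(term))
```

The same term can be pasted directly into the Gene search box; the URL form is only useful when the search needs to be scripted.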