The BARCODE Data Standard as a Cross-Cultural Bridge
David E. Schindel, Executive Secretary
National Museum of Natural History, Smithsonian Institution
SchindelD@si.edu; http://www.barcoding.si.edu
202/633-0812; fax 202/633-2938
Gaining Large Scale Through Standards
Are our data meant only for small segregated communities of practice or bigger audiences?
- Accelerate progress, economies of scale
- Re-use and new use of data, synthesis, comparative analysis
- Shared hardware and software
- Standardized protocols, easier training and technical assistance
- Applications by non-specialists (regulatory agencies, citizen scientists, K-12 classroom)
www.e-biosphere09.org
Species Identification Matters
Basic research:
- One more character set, but digital and calibrated
- Standardized yardstick for measuring variability and divergence
- Objective comparison across taxa, distance
- Links to Linnean names
- Triage by non-specialists for species discovery
- Ecology of juveniles, gut contents, fecal matter
- Shallow phylogenies showing history of community assemblages
- Subject to weaknesses of any single character (convergence, pseudogenes, introgression, etc.)

Species Identification Matters
Applied research/regulation by non-specialists:
- Agricultural pests/beneficial species
- Endangered/protected species
- Disease vectors/pathogens
- Environmental quality indicators
- Invasive species (e.g., in ballast water)
- Managing for sustainable harvesting
- Consumer protection, ensuring food quality
- Fidelity of seedbanks, culture collections
An Internal ID System for All Animals
[Figure: a typical animal cell, a mitochondrion, and the mitochondrial genome (mtDNA), showing the L-strand and H-strand with labeled regions: D-loop, small ribosomal RNA, cytochrome b, ND1, ND2, ND3, ND4, ND4L, ND5, ND6, COI, COII, COIII, ATPase subunit 6, ATPase subunit 8]
Non-COI regions for other taxa
- Land plants: chloroplast matK and rbcL approved Nov 09
  - 70-75% resolving ability, higher in angiosperms
  - Non-coding plastid and nuclear regions being explored
- Fungi: CBOL Working Group met this week in Amsterdam
  - Agreed to recommend ITS; 72% effective
- Protists: CBOL Working Group July meeting, Berlin
How Barcoding Works
PHASE 1: Build a barcode reference library:
- Well-identified specimen
- Tissue subsample
- DNA extraction, PCR amplification
- DNA sequencing
- Data submission to GenBank
PHASE 2: Identify unknowns:
- Any unidentified juvenile, adult, fragment, product
- Tissue sample, DNA, sequencing
- Comparison with sequences in reference library
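
The comparison step in Phase 2 can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and uses a simple uncorrected p-distance with a nearest-match rule; the function names, species labels, and sequences are made up for illustration and do not represent how GenBank or BOLD perform identification.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only sites where both sequences have an unambiguous base.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def identify(query: str, library: dict) -> tuple:
    """Return (species name, distance) for the closest reference sequence."""
    hits = ((name, p_distance(query, seq)) for name, seq in library.items())
    return min(hits, key=lambda hit: hit[1])


# Phase 1: a toy reference library (hypothetical names and sequences).
reference_library = {
    "Species alpha": "ACGTACGTACGTACGTACGT",
    "Species beta":  "ACGTTCGTACGAACGTACCT",
    "Species gamma": "TCGTACGAACGTTCGTACGA",
}

# Phase 2: an unidentified sample, one substitution away from "Species alpha".
unknown = "ACGTACGTACGTACGTACGA"
best_name, best_dist = identify(unknown, reference_library)
print(best_name, round(best_dist, 3))  # Species alpha 0.05
```

A real workflow would also need an alignment step and a decision rule for when the closest match is too distant to count as an identification.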
Barcode of Life Community
1,264,000 specimens already barcoded from 104,500 species
Networks, Projects, Organizations
- Build participation
- Working Groups
- BARCODE standard
- International Conferences
- Increase production of public BARCODE records
BARCODE Record Flow Chart
[Figure: flow chart. Key: Mirroring, Update Channel, Private Records. Labels: USER, GenBank]
BARCODE Records in GenBank
Submission of BARCODE Records to EBI and DDBJ
BARCODE Records in INSDC
[Diagram labels]
Voucher Specimen
Species Name
Specimen Metadata
Georeference, Habitat, Character sets, Images, Behavior, Other genes
Indices - Catalogue of Life - GBIF/ECAT
Nomenclators - Zoo Record - IPNI - NameBank
Publication links - New species
Barcode Sequence
Trace files
Primers
Other Databases
Literature (link to content or citation)
Phylogenetic, Pop'n Genetics, Ecological
Databases - Provisional sp.
Linkout from GenBank to BOLD
Linkout from GenBank to Taxonomy
ISBER: 13 May 2009
Link from GenBank to Museums
Darwin Core Triplet: Structured Link to Vouchers
Institutional Acronym : Collection Code : Catalog ID

Structured Link to Vouchers
NHM : LEP : 123456
personal : DHJanzen : SRNP12345
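
A short sketch of how such a triplet might be handled in software is given here. It assumes the three fields are joined by colons, as the slide suggests; the class and function names are hypothetical and not part of any standard library or of the BARCODE standard itself.

```python
from typing import NamedTuple


class VoucherTriplet(NamedTuple):
    institution: str  # institutional acronym, e.g. "NHM" (or "personal")
    collection: str   # collection code, e.g. "LEP"
    catalog_id: str   # catalog ID within that collection


def parse_triplet(text: str) -> VoucherTriplet:
    """Split 'institution:collection:catalog_id' into its three fields."""
    parts = [p.strip() for p in text.split(":")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected three non-empty colon-separated fields: {text!r}"
        )
    return VoucherTriplet(*parts)


# The two examples from the slide.
museum = parse_triplet("NHM:LEP:123456")
private = parse_triplet("personal:DHJanzen:SRNP12345")
print(museum.institution, museum.collection, museum.catalog_id)
print(private.collection)
```

Rejecting strings with missing fields up front keeps malformed voucher links from silently pointing at the wrong specimen.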
NCBI’s Biorepository List
- Compiled from Index Herbariorum, literature sources, GenBank submissions
- 6,936 records
- 1,177 records with non-unique acronyms
- 517 homonymous acronyms
  - 374 shared by two records
  - 143 shared by three records
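
The figures on this slide are internally consistent, which the following sketch checks: 374 + 143 gives the 517 homonymous acronyms, and 2 × 374 + 3 × 143 gives the 1,177 affected records. The `homonyms` helper and the sample acronym list are hypothetical, included only to show how shared acronyms could be detected in a list of records.

```python
from collections import Counter

# Figures taken from the slide.
shared_by_two = 374
shared_by_three = 143

homonymous_acronyms = shared_by_two + shared_by_three
affected_records = 2 * shared_by_two + 3 * shared_by_three
print(homonymous_acronyms, affected_records)  # 517 1177


def homonyms(acronyms):
    """Return {acronym: count} for every acronym used by more than one record."""
    counts = Counter(acronyms)
    return {acronym: n for acronym, n in counts.items() if n > 1}


# Hypothetical sample: one acronym per biorepository record.
sample = ["NHM", "MCZ", "NHM", "USNM", "MCZ", "NHM"]
print(homonyms(sample))  # {'NHM': 3, 'MCZ': 2}
```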

Schindel i evobio norman ok - jun 11

  • 34. CBOL/GBIF/NCBI Registry of Biorepositories www.biorepositories.org
  • 35. Two Taxonomic Research Processes
    [Diagram labels] Accessibility; Formal naming; Collaborative consensus-building of taxon concepts (CATE); Sharing of non-BARCODE data (ScratchPads); BARCODE data release with provisional nomenclature (PLoS); Specimen data release (GBIF); Comparisons, concept validation; Taxon concept formation, refinement; Collecting events, specimens; Specimen clustering
  • 36. Long-term data curation of BARCODE records
    [Flow chart labels] Data records assembled in BOLD; Community feedback; Compliant with BARCODE standards?; Update records (audit trail of species names retained); Data records released on INSDC; IDs consistent with other records?; GenBank adds BARCODE flag; CBOL control of BARCODE flag; Data records published in BOLD
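
The "audit trail of species names retained" step can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the class, the field names, the record identifier, and the placeholder species names do not reflect the actual BOLD or INSDC data model.

```python
from dataclasses import dataclass, field


@dataclass
class BarcodeRecord:
    record_id: str     # hypothetical identifier
    species_name: str  # current species name
    name_history: list = field(default_factory=list)  # earlier names, oldest first

    def rename(self, new_name: str) -> None:
        """Update the species name while retaining the previous one."""
        if new_name != self.species_name:
            self.name_history.append(self.species_name)
            self.species_name = new_name


# A record released under a provisional name, later updated to a formal one.
record = BarcodeRecord("EXAMPLE0001", "Genus sp. A")
record.rename("Genus exampleus")
print(record.species_name, record.name_history)
```

Keeping the old names on the record itself means a later reader can still trace which provisional name a published sequence was first released under.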