Schindel i evobio norman ok - jun 11


DNA Barcode Data Standards presentation at the iEvoBio (Informatics for Evolutionary Biology) meeting in Norman, OK, 22 June 2011

Published in: Technology

  1. The BARCODE Data Standard as a Cross-Cultural Bridge
     David E. Schindel, Executive Secretary
     National Museum of Natural History
     Smithsonian Institution
     202/633-0812; fax 202/633-2938
  2. Gaining Large Scale Through Standards
     • Are our data meant only for small segregated communities of practice or bigger audiences?
     • Accelerate progress, economies of scale
     • Re-use and new use of data, synthesis, comparative analysis
     • Shared hardware and software
     • Standardized protocols, easier training and technical assistance
     • Applications by non-specialists (regulatory agencies, citizen scientists, K-12 classroom)
  4. Species Identification Matters
     Basic research:
     • One more character set, but digital and calibrated
     • Standardized yardstick for measuring variability and divergence
     • Objective comparison across taxa, distance
     • Links to Linnean names
     • Triage by non-specialists for species discovery
     • Ecology of juveniles, gut contents, fecal matter
     • Shallow phylogenies showing history of community assemblages
     • Subject to weaknesses of any single character (convergence, pseudogenes, introgression, etc.)
  5. Species Identification Matters
     Applied research/regulation by non-specialists:
     • Agricultural pests/beneficial species
     • Endangered/protected species
     • Disease vectors/pathogens
     • Environmental quality indicators
     • Invasive species (e.g., in ballast water)
     • Managing for sustainable harvesting
     • Consumer protection, ensuring food quality
     • Fidelity of seedbanks, culture collections
  8. An Internal ID System for All Animals
     [Figure: The Mitochondrial Genome (mtDNA), shown within the mitochondrion of a typical animal cell. Labels: L-strand, H-strand, D-Loop, small ribosomal RNA, Cytochrome b, ND1, ND2, ND3, ND4, ND4L, ND5, ND6, COI, COII, COIII, ATPase subunit 6, ATPase subunit 8]
  9. Non-COI regions for other taxa
     Land plants:
     • Chloroplast matK and rbcL approved Nov 2009
     • 70-75% resolving ability, higher in angiosperms
     • Non-coding plastid and nuclear regions being explored
     Fungi:
     • CBOL Working Group met this week in Amsterdam
     • Agreed to recommend ITS; 72% effective
     Protists:
     • CBOL Working Group July meeting, Berlin
  10. How Barcoding Works
      PHASE 1: Build a barcode reference library:
      • Well-identified specimen
      • Tissue subsample
      • DNA extraction, PCR amplification
      • DNA sequencing
      • Data submission to GenBank
      PHASE 2: Identify unknowns:
      • Any unidentified juvenile, adult, fragment, product
      • Tissue sample, DNA, sequencing
      • Comparison with sequences in reference library
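
The Phase 2 comparison on the "How Barcoding Works" slide is, at its simplest, a nearest-match lookup against the reference library. The following is a minimal sketch, assuming pre-aligned sequences of equal length and an uncorrected p-distance. The function names, the toy sequences, and the 2% threshold are illustrative assumptions, not part of the BARCODE standard or of any identification engine named in the talk.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has anything other than A, C, G or T
    (gaps, ambiguity codes) are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def identify(query, library, max_distance=0.02):
    """Return (species, distance) for the closest reference sequence.

    The species is None when even the closest match is farther away
    than max_distance (a hypothetical cut-off chosen for this example).
    """
    best_species, best_distance = None, float("inf")
    for species, reference in library.items():
        d = p_distance(query, reference)
        if d < best_distance:
            best_species, best_distance = species, d
    if best_distance > max_distance:
        return None, best_distance
    return best_species, best_distance


# Toy reference library: made-up 20-base fragments, not real COI data.
reference_library = {
    "Species A": "ACGTACGTACGTACGTACGT",
    "Species B": "ACGTTCGTACGAACGTACCT",
}
unknown = "ACGTACGTACGTACGTACGT"
print(identify(unknown, reference_library))
```

Real barcode identification works on sequences of several hundred bases and has to handle alignment and intraspecific variation, which this sketch leaves out.
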
  11. Barcode of Life Community
      • 1,264,000 specimens already barcoded from 104,500 species
      • Networks, Projects, Organizations
        • Promote barcoding as a global standard
        • Build participation
        • Working Groups
        • BARCODE standard
        • International Conferences
        • Increase production of public BARCODE records
      • Barcode of Life Data Systems (BOLD)
        • University of Guelph
        • Workbench with 1.27M records, 105K species/OTUs
  17. BARCODE Record Flow Chart
      [Flow chart. Key: Mirroring; Update Channel; Private Records. Labels: USER, /GenBank]
  18. BARCODE Records in GenBank
  19. Submission of BARCODE Records to EBI and DDBJ
  21. BARCODE Records in INSDC
      • Voucher Specimen
      • Species Name
      • Specimen Metadata
      • Georeference, Habitat, Character sets, Images, Behavior, Other genes
      • Indices - Catalogue of Life - GBIF/ECAT
      • Nomenclators - Zoo Record - IPNI - NameBank
      • Publication links - New species
      • Barcode Sequence
      • Trace files
      • Primers
      • Other Databases
      • Literature (link to content or citation)
      • Phylogenetic, Population Genetics, Ecological
      • Databases - Provisional sp.
  25. Linkout from GenBank to BOLD
  27. Linkout from GenBank to Taxonomy
      ISBER: 13 May 2009
  29. Link from GenBank to Museums
  30. Darwin Core Triplet: Structured Link to Vouchers
      Institutional Acronym : Collection Code : Catalog ID
  31. Structured Link to Vouchers
      • NHM:LEP:123456
      • personal:DHJanzen:SRNP12345
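
The voucher link on the two slides above is three colon-separated fields: institutional acronym, collection code, and catalog ID. As a minimal sketch, the triplet can be split and validated as shown here. The `VoucherTriplet` type and `parse_triplet` function are hypothetical names introduced for illustration, and the sketch assumes all three fields are required and that none of them contains a colon.

```python
from typing import NamedTuple


class VoucherTriplet(NamedTuple):
    institution: str  # institutional acronym, e.g. "NHM"
    collection: str   # collection code, e.g. "LEP"
    catalog_id: str   # catalog ID, e.g. "123456"


def parse_triplet(text: str) -> VoucherTriplet:
    """Split 'institution:collection:catalog_id' into its three fields."""
    parts = [part.strip() for part in text.split(":")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected institution:collection:catalog_id, got {text!r}"
        )
    return VoucherTriplet(*parts)


# The two examples from the slide.
for raw in ("NHM:LEP:123456", "personal:DHJanzen:SRNP12345"):
    triplet = parse_triplet(raw)
    print(triplet.institution, triplet.collection, triplet.catalog_id)
```

A structured link like this only resolves to a single specimen if the institutional acronym itself is unambiguous.
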
  32. NCBI’s Biorepository List
      • Compiled from Index Herbariorum, literature sources, GenBank submissions
      • 6,936 records
      • 1,177 records with non-unique acronyms
      • 517 homonymous acronyms
      • 374 shared by two records
      • 143 shared by three records
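
The counts on the biorepository slide are internally consistent: 374 + 143 = 517 homonymous acronyms, and 374 × 2 + 143 × 3 = 1,177 records with non-unique acronyms. The sketch below checks that arithmetic and shows one way such a tally could be produced from a list of acronyms. The `homonym_summary` function and the toy acronyms are assumptions made for illustration, not NCBI's actual procedure or data.

```python
from collections import Counter


def homonym_summary(acronyms):
    """Summarize acronyms that are used by more than one record."""
    counts = Counter(acronyms)
    shared = {acronym: n for acronym, n in counts.items() if n > 1}
    return {
        "homonymous_acronyms": len(shared),
        "records_with_non_unique_acronyms": sum(shared.values()),
        # maps "number of records sharing an acronym" to "how many such acronyms"
        "acronyms_by_number_of_records": dict(Counter(shared.values())),
    }


# Consistency check on the figures quoted on the slide.
assert 374 + 143 == 517
assert 374 * 2 + 143 * 3 == 1177

# Toy input: placeholder acronyms, one shared twice and one shared three times.
toy_acronyms = ["AAA", "AAA", "BBB", "CCC", "CCC", "CCC", "DDD"]
print(homonym_summary(toy_acronyms))
```
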
  34. CBOL/GBIF/NCBI Registry of Biorepositories
  35. Two Taxonomic Research Processes
      • Accessibility
      • Formal naming
      • Collaborative consensus-building of taxon concepts (CATE)
      • Sharing of non-BARCODE data (ScratchPads)
      • BARCODE data release with provisional nomenclature (PLoS)
      • Specimen data release (GBIF)
      • Comparisons, concept validation
      • Taxon concept formation, refinement
      • Collecting events, specimens
      • Specimen clustering
  36. Long-term data curation of BARCODE records
      • Data records assembled in BOLD
      • Community feedback
      • Compliant with BARCODE standards?
      • Update records (audit trail of species names retained)
      • Data records released on INSDC
      • IDs consistent with other records?
      • GenBank adds BARCODE flag
      • CBOL control of BARCODE flag
      • Data records published in BOLD